(57) Abstract: The invention provides a protein in a form that is functional for the enzymatic conversion of
2C-methyl-D-erythritol 2,4-cyclodiphosphate to 1-hydroxy-2-methyl-2-butenyl 4-diphosphate, notably in its (E)-form, of the
non-mevalonate biosynthetic pathway to isoprenoids. The invention also provides a protein in a form that is functional for
the enzymatic conversion of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate, notably in its (E)-form, to isopentenyl diphosphate
and/or dimethylallyl diphosphate. Further, screening methods for inhibitors of these proteins are provided. Further,
1-hydroxy-2-methyl-2-butenyl 4-diphosphate is provided, together with chemical and enzymatic methods for its preparation.

Intermediates and enzymes of the non-mevalonate isoprenoid pathway
Field of the invention 

The present invention relates to cells, cell cultures or organisms or parts thereof for the 
efficient formation of a biosynthetic product or intermediate or enzyme of a 1-deoxy-D-xylulose
5-phosphate-dependent biosynthetic pathway. Further, the invention relates to vectors for 
producing them. Further, the invention relates to their use for the formation or production of 
intermediates or products or enzymes of said biosynthetic pathway as well as to enzymes and 
intermediates. Further, the invention relates to the screening for inhibitors or enzymes for said 
biosynthetic pathway. 

Background of the invention 

The system of biosynthetic pathways in any organism is highly streamlined, whereby a few 
central trunk pathways branch into a great number of peripheral pathways. The central trunk 
pathways involve starting materials which are highly integrated. Therefore, central or trunk 
pathways are highly regulated. At the same time they are crucial for any attempts to interfere 
with the metabolism of any organism either by an inhibitor or by metabolic engineering. 
The isoprenoid pathways are a prime example for this metabolic organisation. They are very 
long and highly branched, leading to some 30,000 isoprenoid or terpenoid compounds. They 
all seem to derive from isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP).
They are produced by two alternative trunk pathways (reviewed in Eisenreich et al., 2001). 
By the classical research of Bloch, Cornforth, Lynen and co-workers, isopentenyl 
pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP) have become established as 
key intermediates in the biosynthesis of isoprenoids via mevalonate. However, many bacteria, 
plastids of all plants, and the protozoon Plasmodium falciparum synthesize IPP and DMAPP 
by an alternative pathway via 1-deoxy-D-xylulose 5-phosphate. The discovery of the pathway 
was mainly based on the incorporation of isotope-labelled 1-deoxy-D-xylulose into the
isoprenoid side chain of menaquinones from Escherichia coli (Arigoni and Schwarz, 1999).
This mevalonate-independent pathway has so far only been partially explored (Fig. 1). For a
better understanding of these aspects of the invention, the pathway shall be briefly explained. 
It can be divided into three segments:

In a first pathway segment shown in Fig. 1, pyruvate (1) is condensed with glyceraldehyde
3-phosphate (2) to 1-deoxy-D-xylulose 5-phosphate (DXP) (3). Subsequently, DXP is converted
into 2C-methyl-D-erythritol 4-phosphate (MEP) (4) by a two-step reaction comprising a
rearrangement and a reduction. This establishes the 5-carbon isoprenoid skeleton.
In the subsequent segment of the mevalonate-independent pathway (Fig. 1), MEP (4) is first
condensed with CTP to 4-diphosphocytidyl-2C-methyl-D-erythritol (CDP-ME) (5) by
4-diphosphocytidyl-2C-methyl-D-erythritol synthase (PCT/EP00/07548). CDP-ME (5) is
subsequently phosphorylated in an ATP-dependent reaction by 4-diphosphocytidyl-2C-methyl-D-erythritol
kinase, yielding 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate (CDP-MEP) (6). The
intermediate is subsequently converted into 2C-methyl-D-erythritol 2,4-cyclodiphosphate
(cMEPP) (7) by 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (PCT/EP00/07548).
These three enzymatic steps form a biosynthetic unit which activates the isoprenoid
C5-skeleton for the third pathway segment (Rohdich et al., 1999; Lüttgen et al., 2000; Herz et al.,
2000).
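
The two pathway segments described above can be written down as an ordered list of steps. The following Python sketch is only an illustration and not part of the original disclosure: the gene names dxs, dxr, ispD, ispE and ispF are the E. coli designations used elsewhere in this document, while the data layout and the helper name intermediates_downstream_of are hypothetical.

```python
# Illustrative model of pathway segments 1 and 2 (Fig. 1).
# Each step is (gene, substrate, product); abbreviations follow the text.
PATHWAY_STEPS = [
    ("dxs", "pyruvate + glyceraldehyde 3-phosphate", "DXP"),
    ("dxr", "DXP", "MEP"),
    ("ispD", "MEP", "CDP-ME"),
    ("ispE", "CDP-ME", "CDP-MEP"),
    ("ispF", "CDP-MEP", "cMEPP"),
]


def intermediates_downstream_of(gene):
    """Return the products formed at or after the step encoded by `gene`."""
    genes = [g for g, _, _ in PATHWAY_STEPS]
    start = genes.index(gene)  # raises ValueError for an unknown gene
    return [product for _, _, product in PATHWAY_STEPS[start:]]


if __name__ == "__main__":
    # Intermediates expected to be missing if the ispD step were blocked.
    print(intermediates_downstream_of("ispD"))
```

Such a listing makes it easy to see which intermediates would be expected to disappear when a given step is blocked, for example by an inhibitor or a gene knock-out.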

Bioinformatic studies (German Patent Application 10027821.3), as well as studies with mutants
of Synechocystis sp. (Cunningham et al., 2000) and Escherichia coli (Campos et al., 2001;
Altincicek et al., 2001) demonstrate the involvement of lytB and gcpE genes in the isoprenoid
pathway. However, the function and the reaction catalyzed by the corresponding gene 
products are still unknown. 

Recently, a kinase (XylB) has been described that catalyzes the conversion of 1-deoxy-D-xylulose
into 1-deoxy-D-xylulose 5-phosphate at high rates (Wungsintaweekul et al., 2000).
Genes and enzymes participating in further downstream reactions have been described. 
However, the gene functions, the intermediates, and the mechanisms leading to the products 
are still unknown. 

For numerous pathogenic eubacteria as well as for the malaria parasite P. falciparum, the 
enzymes involved in the non-mevalonate pathway are essential. The intermediates of the 
mevalonate-independent pathway cannot be assimilated from the environment by pathogenic 
eubacteria and P. falciparum. The enzymes of the alternative isoprenoid pathway do not occur
in mammals, which synthesize their isoprenoids and terpenoids exclusively via the mevalonate
pathway. Moreover, the idiosyncratic nature of the reactions in this pathway reduces the risk 
of cross-inhibitions with other, notably mammalian enzymes. 

Therefore, enzymes of the alternative isoprenoid pathway seem to be especially suited as
targets for novel agents against pathogenic microorganisms and for herbicides. The elucidation of
unknown steps and the identification of these targets, e.g. genes and cognate enzymes of
these pathways, is obligatory for this purpose.

A further source of interest in the non-mevalonate pathway derives from the fact that certain
pathogens like Mycobacteria, Plasmodia, Escherichia etc. use this pathway to activate γδ T
cells (Fournie and Bonneville, 1996). Therefore, γδ T cells likely act as a first line of defense
against infections by such pathogens. Intermediates of the non-mevalonate pathway have
been suggested to be responsible for γδ T cell activation (Jomaa et al., 1999). Recently, it was
shown that E. coli strains lost the ability to stimulate γδ T cells when the dxr or the gcpE gene
was knocked out (Altincicek et al., 2001).

Moreover, there is a great biotechnological interest in these pathways, since they lead to
valuable vitamins and isoprenoid or terpenoid products. 

Previous attempts to approach these goals have been hampered by the low rate of 
biosynthesis along these pathways in wild-type cells studied so far. 

Summary of the invention 

It is an object of the invention to provide enzymes and nucleic acids coding for said enzymes 
as well as intermediates for the conversion of 2C-methyl-D-erythritol 2,4-cyclodiphosphate to 
isopentenyl diphosphate and/or dimethylallyl diphosphate. 

It has surprisingly been found that the intermediate in the conversion of 2C-methyl-D-erythritol
2,4-cyclodiphosphate to isopentenyl diphosphate and/or dimethylallyl diphosphate is
1-hydroxy-2-methyl-2-butenyl 4-diphosphate. This intermediate is formed by an enzyme encoded
by gcpE as designated in the E. coli genome. It has further been found that this enzyme
prefers NADH or NADPH as reductant. Further, it has been found that it is promoted by Co²⁺.

The above intermediate is converted to isopentenyl diphosphate and/or dimethylallyl 
diphosphate by an enzyme encoded by lytB as designated in the E. coli genome. The latter
enzyme prefers as reductant NADH or NADPH and FAD as mediator. Further it can be 
promoted by ions of a metal selected from manganese, iron, cobalt, nickel. 

With these findings, the third segment of the trunk non-mevalonate pathway is now
established. The key to these findings is the intermediate 1-hydroxy-2-methyl-2-butenyl
4-diphosphate, notably in its E-form. This establishes the unifying principle of the invention for
reactions to and from this intermediate. 

Further, it is an object of the invention to provide cells, cell cultures, organisms or parts thereof 
for the efficient biosynthesis of isoprenoid products or intermediates of the non-mevalonate 
biosynthetic pathway dependent on 1-deoxy-D-xylulose 5-phosphate production from
1-deoxy-D-xylulose and/or glucose.

The present invention produces a novel in vivo system which can be used for the structure 
elucidation of unknown intermediates and the assignment of biological functions of putative 
genes or cognate enzymes in the alternative isoprenoid biosynthetic pathway. As an example, 
the functional assignment of the gcpE gene (now designated as ispG) and of the lytB gene 
(now designated ispH) in the mevalonate-independent pathway of isoprenoid biosynthesis is 
achieved. 

More specifically, said in vivo system consists of recombinant E. coli strains harbouring vector 
construct(s) carrying and expressing genes for D-xylulokinase (xylB), and genes of further
downstream steps of terpenoid biosynthesis, such as dxs, dxr, and/or ispD, and/or ispE, and/or
ispF, and/or gcpE, and/or lytB from E. coli, and/or a carotenoid gene cluster from Erwinia
uredovora. 

In one aspect of the invention, the genetically modified strains can be fed with
1-deoxy-D-xylulose, notably with isotope-labelled 1-deoxy-D-xylulose, which is converted at high
rates into the common intermediate of the mevalonate-independent terpenoid pathway,
1-deoxy-D-xylulose 5-phosphate, and into further intermediates of said pathway, like
2C-methyl-D-erythritol 4-phosphate, 4-diphosphocytidyl-2C-methyl-D-erythritol,
4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate, 2C-methyl-D-erythritol 2,4-cyclodiphosphate,
1-hydroxy-2-methyl-2-butenyl 4-diphosphate, isopentenyl diphosphate, and dimethylallyl diphosphate.
Further, feeding with glucose or an intermediate of glycolysis for conversion into said further 
intermediates of said pathway may be performed. 

Said systems are useful for the structure elucidation of hitherto elusive intermediates in the
biosynthetic pathways, for in vivo screening of novel antibiotics, antimalarials, and herbicides,
and as a platform for the bioconversion of exogenous 1-deoxy-D-xylulose and/or glucose into 
intermediates and products of the non-mevalonate pathway of terpenoid biosynthesis. 
Said systems can also be used for screening chemical libraries for potential herbicides, and/or 
antimalarials, and/or antimicrobial substances by detecting and measuring the amount of 
certain intermediates formed in vivo in the presence or absence of potential inhibitors of the 
gene products of mevalonate-independent isoprenoid pathway genes, namely dxs, dxr, ispD, 
ispE, ispF, gcpE, and lytB. 

Said system can further be used for the production of higher isoprenoids (e.g. isoprenoids 
having 10, 15, 20, 30 or 40 carbon atoms) such as carotene, α-tocopherol or vitamins by
boosting the biosynthesis of isopentenyl diphosphate and/or dimethylallyl diphosphate via the
non-mevalonate pathway, e.g. by using glucose as feeding material. Further feeding materials 
which may be used are intermediates or products of glycolysis like glyceraldehyde 3- 
phosphate or pyruvate. 
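
Because all isoprenoids derive from the C5 precursors isopentenyl diphosphate and dimethylallyl diphosphate, the carbon counts listed above correspond to whole numbers of C5 units. The short Python sketch below merely illustrates this arithmetic; the function name c5_units is hypothetical.

```python
# Number of C5 (IPP/DMAPP-derived) units in a higher isoprenoid skeleton.
C5_UNIT_CARBONS = 5


def c5_units(carbon_atoms):
    """Return how many C5 units make up a skeleton with `carbon_atoms` carbons."""
    if carbon_atoms <= 0 or carbon_atoms % C5_UNIT_CARBONS != 0:
        raise ValueError("carbon count must be a positive multiple of 5")
    return carbon_atoms // C5_UNIT_CARBONS


if __name__ == "__main__":
    for n in (10, 15, 20, 30, 40):
        print(f"C{n}: {c5_units(n)} C5 units")
```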

Further, this invention provides novel compounds of formula I (see below), notably 1- 
hydroxy-2-methyl-2-butenyl 4-diphosphate as well as enzymatic and chemical methods for 
preparing said compounds. As demonstrated herein, (E)-1-hydroxy-2-methyl-2-butenyl 4- 
diphosphate is produced from 2C-methyl-D-erythritol 2,4-cyclodiphosphate by the gcpE 
gene product. 

It is further demonstrated herein that (E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate is
converted to dimethylallyl diphosphate and isopentenyl diphosphate by the lytB gene 
product. 

Short description of the figures and annexes 

Figure 1: Biosynthesis of both isoprenoid precursors, isopentenyl pyrophosphate and 
dimethylallyl pyrophosphate via the mevalonate-independent pathway. 

Figure 2: Scheme of an Escherichia coli in vivo system for generating optionally isotopically
labelled intermediates of biosynthetic pathways such as the mevalonate-independent 
isoprenoid biosynthesis, and for the production of higher terpenoids such as carotenoids. 


WO 02/083720 PCT/EP02/04005

Figure 3: NMR spectra in D2O (pH 6) obtained according to Example 25. * indicates
impurities. 

Figure 4: Preparation of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate according to Example 24. 
Reagents and conditions were as follows: (a) DHP, PPTS, 25 °C (2.5 h); (b) Ph3PCHCO2Et,
toluene, reflux (39 h); (c) (1) DIBAH, CH2Cl2, -78 °C (3 h), (2) 1 M NaOH/H2O; (d) p-TsCl,
DMAP, CH2Cl2, 25 °C (1 h); (e) ((CH3CH2CH2CH2)4N)3HP2O7, MeCN, 25 °C (2 h); (f) HCl/H2O,
pH 1, 25 °C (7 min).

Figure 5: The reaction catalyzed by the ispH (formerly lytB) gene product. 

Figure 6: The reaction catalyzed by the ispG (formerly gcpE) gene product. 

Figure 7: Chemical preparation of 3-formyl-but-2-enyl 1-diphosphate (see Example 42).

Annex A: DNA sequence of the vector construct pBSxylBdxr. 
Annex B: DNA sequence of the vector construct pBSxylBdxrispD. 
Annex C: DNA sequence of the vector construct pBScyclo. 
Annex D: DNA sequence of the vector construct pACYCgcpE. 
Annex E: DNA sequence of the vector construct pBScaro14. 
Annex F: DNA sequence of the vector construct pACYCcaro14. 

Annex G: DNA sequence and corresponding amino acid sequence of the ispG (formerly gcpE) 
gene from Escherichia coli. 

Annex H: DNA sequence of the vector construct pBScyclogcpE. 
Annex I: DNA sequence of the vector construct pACYClytBgcpE.

Annex J: DNA and corresponding amino acid sequence of the ispH (formerly lytB) gene from 
Escherichia coli. 

Annex K: DNA sequence of the vector construct pBScyclogcpElytB2. 

Annex L: DNA and corresponding amino acid sequence of the ispG gene (fragment) from 
Arabidopsis thaliana. 

Annex M: DNA and corresponding amino acid sequence of the ispG (formerly gcpE) gene of
Arabidopsis thaliana. 

Annex N: cDNA sequence of 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate reductase (IspH)
from Arabidopsis thaliana.

Detailed description of the invention

1-Deoxy-D-xylulose 5-phosphate is a common intermediate in the alternative terpenoid 
pathway via 2C-methyl-D-erythritol 4-phosphate. This latter pathway is operative in bacteria, 
certain protozoa and most significantly also in the plastids of plants, where it is in charge of the 
biosynthesis of a great many valuable terpenoid products, like natural rubber, carotenoids, 
menthol, menthone, camphor or paclitaxel. The alternative terpenoid pathway is now intensely 
studied. But so far only the initial steps from glyceraldehyde 3-phosphate and pyruvate via 1- 
deoxy-D-xylulose 5-phosphate and 2C-methyl-D-erythritol 4-phosphate, 4-diphosphocytidyl- 
2C-methyl-D-erythritol, 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate and 2C-methyl- 
D-erythritol 2,4-cyclodiphosphate (Fig. 1) have been elucidated. 

The intermediate 1-deoxy-D-xylulose 5-phosphate is of most crucial significance for a number 
of commercial purposes: 

(1) It may be used as a key intermediate for commercial screening procedures regarding 
potential inhibitors of downstream enzymes in the biosynthesis of the alternative terpenoid 
pathway. 

(2) It may be used as a key intermediate for the in vitro production of terpenoids or of 
intermediates thereof. 

(3) It occurs in vivo in the biosynthesis of terpenoids as an enzymatic condensation product 
of glyceraldehyde 3-phosphate and pyruvate. The latter are central intermediates of the 
metabolism and obligatory starting materials for numerous biosynthetic pathways. 
Therefore, it is desirable to generate a high level of 1-deoxy-D-xylulose 5-phosphate in 
vivo from an exogenous source and thus independent from the pools of glyceraldehyde 3- 
phosphate and pyruvate for boosting the biosynthesis of terpenoids or of intermediates 
thereof in microorganisms or cell cultures that are either naturally or recombinantly 
endowed with the pathway of interest without influencing the basic intermediary 
metabolism of the cells. 

(4) 1-Deoxy-D-xylulose 5-phosphate can be generated from 1-deoxy-D-xylulose by the 
catalytic action of the xylB gene product. Using recombinant strains comprising the xylB 
gene the reaction occurs in vivo and exogenous 1-deoxy-D-xylulose is converted into 
intracellular 1-deoxy-D-xylulose 5-phosphate at high rates. 

(5) 1-DXP can be generated from glucose by the catalytic action of glycolytic enzymes and DXP
synthase. Using recombinant strains comprising the dxs gene, the reaction occurs in vivo
and exogenous glucose is converted to intracellular 1-DXP at high rates.

It is an aspect of the invention to use 1-deoxy-D-xylulose as a precursor in order to boost the
rates of biosynthesis of 1-deoxy-D-xylulose 5-phosphate-dependent pathways. 1-Deoxy-D-xylulose
can be prepared by various published procedures (Blagg and Poulter, 1999; Kennedy
et al., 1995; Piel and Boland, 1997; Shono et al., 1983; Giner, 1998).

It is an aspect of the present invention to use 1-deoxy-D-xylulose in various isotopically
labelled forms. It may be labelled by radioactive isotopes or non-radioactive isotopes of C (¹³C
or ¹⁴C), H (D or T) or O (¹⁷O or ¹⁸O) in any combination.

Isotope-labelled 1-deoxy-D-xylulose may be prepared enzymatically using 1-deoxy-D-xylulose 
5-phosphate synthase of Bacillus subtilis and commercially available glycolytic enzymes and
phosphatase from isotope-labelled glucose and/or pyruvate (PCT/EP00/07548).
1-Deoxy-D-xylulose may be used as a free acid or as a salt, preferably as an alkaline (e.g.,
lithium, sodium, potassium) salt or as an ammonium or amine salt. 

It is an aspect of the present invention to use recombinant cells, cell cultures, or organisms or 
parts thereof for the formation of biosynthetic products or intermediates or enzymes or for the 
screening for antimicrobials, antimalarials or herbicides. 

For carrying out the present invention various techniques in molecular biology, microbiology 
and recombinant DNA technology are used which are comprehensively described in Sambrook
et al., Molecular Cloning, second edition, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, New York; in DNA Cloning: A Practical Approach, Vol. 1 and 2, 1985 (D. N. Glover,
ed.); in Oligonucleotide Synthesis, 1984 (M. L. Gait, ed.); and in Transcription and Translation
(Hames and Higgins, eds.). 

Nucleic acids 

The present invention comprises nucleic acids which include prokaryotic, protozoal and plant 
sequences and derived sequences. A derived sequence relates to a nucleic acid sequence 
corresponding to a region of the sequence or orthologs thereof or complementary to 
"sequence-conservative" or "function-conservative" variants thereof. 

Sequences may be isolated by well known techniques or are commercially available (Clontech,
Palo Alto, CA; Stratagene, La Jolla, CA). Alternatively, PCR-based methods can be used for
amplifying related sequences from cDNA or genomic DNA.

The nucleic acids of the present invention comprise purine and pyrimidine containing polymers 
in various amounts, either polyribonucleotides or polydeoxyribonucleotides or mixed polyribo- 
polydeoxyribonucleotides. The nucleic acids may be isolated directly from cells. Alternatively, 
PCR may be used for the preparation of the nucleic acids, using chemically synthesized
strands or genomic material as template. The primers used in PCR may be synthesized by
using the sequence information provided by the present invention or from the database and 
additionally may be constructed with optionally new restriction sites in order to ease the cloning 
in a vector for recombinant expression. 

The nucleic acids of the present invention may be flanked by natural regulation sequences or
may be associated with heterologous sequences, including promoters, enhancers, response
elements, signal sequences, polyadenylation sequences, introns, 5'- and 3'-noncoding regions
or similar. The nucleic acids may be modified on the basis of well known methods. Non-limiting
examples of these modifications are methylations, "caps", substitution of one or more natural
nucleotides with an analogue, and internucleotide modifications, e.g. those with uncharged bonds
(e.g. methylphosphonates, phosphotriesters, phosphoramidates, carbamates, etc.) and with
charged bonds (e.g. phosphorothioates, etc.). Nucleic acids may carry additional covalently bound
units such as proteins (e.g. nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.),
intercalators (e.g. acridine, psoralen, etc.), chelators (e.g. metals, radioactive metals, iron,
oxidative metals, etc.) and alkylators. The nucleic acids may be derivatized by formation of a
methyl- or ethylphosphotriester bond or of an alkylphosphoramidate bond. Further, the nucleic
acids of the present invention may be modified by labeling, which gives an either directly or
indirectly detectable signal. Examples of such labels include radioisotopes, fluorescent
molecules, biotin and so on.

Vectors 

The invention provides nucleic acid vectors, which comprise the sequences provided by the
present invention or derivatives thereof. Various vectors, including plasmids or vectors for fungi,
have been described for the replication and/or expression in various eucaryotic and procaryotic
hosts. High copy replication vectors are preferred for the purposes of the invention.
Non-limiting examples include pKK plasmids (Clontech), pUC plasmids (Invitrogen, San Diego, CA),
pET plasmids (Novagen, Inc., Madison, WI) or pRSET or pREP (Invitrogen) and various
suitable host cells on the basis of well known techniques. Recombinant cloning vectors often
comprise more than one replication system for the cloning and expression, one or more markers for
the selection in the host, e.g. antibiotic resistance, and one or more expression cassettes.
Suitable hosts may be transformed/transfected/infected by any suitable method, including
electroporation, CaCl2-mediated DNA incorporation, fungal infection, microinjection,
microbombardment or other established methods.

Suitable hosts include bacteria, archaebacteria, fungi, notably yeast, plants, notably
Arabidopsis thaliana, Mentha piperita or Taxus sp., and animal cells, notably mammalian cells.
Most important are E. coli, Bacillus subtilis, Saccharomyces cerevisiae, Saccharomyces
carlsbergensis, Schizosaccharomyces pombe, SF9 cells, C129 cells, 293 cells, Neurospora,
and CHO cells, COS cells, HeLa cells and immortalized myeloid and lymphoid mammalian
cells. Preferred replication systems include M13, ColE1, SV40, baculovirus, lambda,
adenovirus and so on. A great number of transcription, initiation (including ribosomal binding
sites) and termination regulation regions have been isolated and their efficiency for the
transcription and translation of heterologous proteins has been demonstrated in various hosts.
Examples of these regions, and methods for their isolation and use, are well known. Under
suitable conditions for expression, host cells may be used as a source of the recombinantly
synthesized proteins.

Expression systems 

Preferable vectors may include a transcription element (that is, a promoter) functionally
connected with the enzyme domain. Optionally, the promoter may include parts of the operator
region and/or ribosomal binding sites. Non-limiting examples of bacterial promoters that are
compatible with E. coli include: trc promoter, β-lactamase (penicillinase) promoter, lactose
promoter, tryptophan (trp) promoter, arabinose BAD operon promoter, lambda-derived P1
promoter and N gene ribosomal binding site, and the hybrid Tac promoter, derived from
sequences of trp and lac UV5 promoters. Non-limiting examples of yeast promoters include
3-phosphoglycerate kinase promoter, glyceraldehyde 3-phosphate dehydrogenase (GAPDH)
promoter, galactokinase (GAL1) promoter, galactoepimerase promoter and
alcohol dehydrogenase (ADH) promoter. Suitable promoters for mammalian cells include,
without limitation, viral promoters such as those of simian virus 40 (SV40), Rous sarcoma virus (RSV),
adenovirus (ADV) and bovine papilloma virus (BPV). Mammalian cells may also need
terminator sequences, poly-A sequences and enhancer sequences, which may increase
the expression. Sequences which amplify the genes may also be preferred. Furthermore,
sequences may be included which ease the secretion of the recombinant protein from the cell,
which may be, without limitation, a bacterial, yeast or animal cell; examples are a secretion signal
sequence and/or a prehormone sequence.

It is an important aspect of the invention that the combined recombinant endowment with xylB 
and other gene(s) of the alternative C5-isoprenoid pathway and optionally gene(s) for higher
isoprenoids or terpenoids boost(s) these pathways. Preferably, xylB is combined with complete
sets of genes to convert 1-deoxy-D-xylulose 5-phosphate into the desired intermediate or end
products. For intermediates in the C5-isoprenoid pathway, cells are preferably endowed with
one of the combinations of genes given in claim 76. 

For the genes cited herein, the common E. coli designations were used. Other genes from
E. coli or from other organisms (orthologous genes) may also be used if they have the same
functions (function-conservative genes), notably if their gene products catalyze the same
reaction. Further, deletion or insertion variants or fusions of these genes with other genes or
nucleic acids may be used, as long as these variants are function-conservative. The above
genes may be derived from bacteria, protozoa, or from higher or lower plants.

It is another important aspect of the invention that the function of gcpE as following 
immediately downstream from ispF has been determined. Our findings show that the gcpE 
gene product is involved in the formation of the novel compound 1-hydroxy-2-methyl-2-butenyl 
4-diphosphate from 2C-methyl-D-erythritol 2,4-cyclodiphosphate. Therefore, we rename gcpE
to ispG.

In a further aspect of the invention it was shown that the gene product of gcpE is involved in
the formation of the E-isomer of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate from
2C-methyl-D-erythritol 2,4-cyclodiphosphate by comparison with chemically synthesized (E)- and
(Z)-isomers of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate. Therefore, this invention further
pertains to the (E) and (Z) isomers of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate salts or
protonated forms thereof.

It is another important aspect of the invention that the function of lytB has been determined as
following immediately downstream from ispG. Therefore, it is renamed ispH. It is our finding
that ispH is involved in the conversion of (E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate into
isopentenyl 4-diphosphate and/or dimethylallyl 4-diphosphate.

It should be understood that "1-hydroxy-2-methyl-2-butenyl 4-phosphate" and
"1-hydroxy-2-methyl-2-butenyl 4-diphosphate" comprise the free phosphoric and diphosphoric acids,
respectively, and the singly or multiply deprotonated forms thereof, i.e. salts which may be
salts of any cation (including Na, K, NH4, Li, Mg, Ca, Zn, Mn, and Co cations). The protonation
state of (di)phosphates and phosphate derivatives or their conjugated acids in aqueous
solution depends on the pH value of the solution, as is known to persons skilled in the art. The
same applies to other phosphates or phosphate derivatives.

In another aspect of the invention, (E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate has been
successfully incorporated into the lipid soluble fraction of Capsicum annuum chromoplasts. An
isotopic carbon label of this compound was incorporated into the geranylgeraniol, β-carotene, phytoene
and phytofluene fractions of C. annuum chromoplasts, establishing
(E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate as an intermediate of the non-mevalonate pathway
downstream from 2C-methyl-D-erythritol 2,4-cyclodiphosphate and upstream from isopentenyl diphosphate.

It is another aspect of the invention that xylB can be combined with gcpE and optionally other 
genes of the alternative C5 isoprenoid pathway and/or of the higher isoprenoid pathways in 
vector(s) for recombinant engineering. 

As a consequence of our findings regarding gcpE (now ispG) it follows that the gene lytB 
operates downstream of gcpE and thus in service of the conversion of
1-hydroxy-2-methyl-2-butenyl 4-diphosphate to IPP and/or DMAPP. Therefore, it is another aspect of the invention
to combine the gene lytB with xylB and optionally other genes of the common C5-isoprenoid 
pathway or of a higher isoprenoid pathway. 

Our finding allows the efficient formation or production of intermediates or products of the 
isoprenoid pathway with any desired labelling, notably the following intermediates: 
2C-methyl-D-erythritol 4-phosphate; 4-diphosphocytidyl-2C-methyl-D-erythritol;
4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate; 2C-methyl-D-erythritol
2,4-cyclodiphosphate; 1-hydroxy-2-methyl-2-butenyl 4-diphosphate; isopentenyl diphosphate;
dimethylallyl diphosphate.

The formation of end products of the terpenoid pathway (e.g., β-carotene, zeaxanthin,
paclitaxel, menthol, menthone, cannabinoids) may be boosted following the process of the
invention. 

The strains harbouring the recombinant plasmids can be cultivated in conventional culture
media, preferably in terrific broth medium, at 16 to 40 °C. The preferred temperature is 37 °C.
The E. coli strains are induced with 0.5 to 2 mM isopropyl-β-D-thiogalactoside (IPTG) at an
optical density at 600 nm from 0.5 to 5. The cells are incubated after addition of
1-deoxy-D-xylulose at a concentration of 0.001 mM to 1 M, preferably at a concentration of 0.01 to 30 mM,
for 30 min to 15 h, preferably 1 to 5 h.
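
The cultivation and feeding ranges stated above lend themselves to a simple range check when planning an experiment. The following Python sketch is an illustration only; the class FeedingRun, its field names and the function out_of_range are hypothetical, and only the numeric limits are taken from the preceding paragraph.

```python
from dataclasses import dataclass


@dataclass
class FeedingRun:
    """Planned conditions for one feeding experiment (hypothetical record)."""
    temperature_c: float       # cultivation temperature in degrees Celsius
    iptg_mm: float             # IPTG concentration in mM
    od600_at_induction: float  # optical density at 600 nm when induced
    dx_mm: float               # 1-deoxy-D-xylulose concentration in mM
    incubation_h: float        # incubation time after feeding, in hours


# Broad ranges from the description: (lower limit, upper limit).
RANGES = {
    "temperature_c": (16.0, 40.0),
    "iptg_mm": (0.5, 2.0),
    "od600_at_induction": (0.5, 5.0),
    "dx_mm": (0.001, 1000.0),  # 0.001 mM to 1 M
    "incubation_h": (0.5, 15.0),
}


def out_of_range(run):
    """Return the names of all parameters lying outside the stated ranges."""
    return [
        name
        for name, (low, high) in RANGES.items()
        if not low <= getattr(run, name) <= high
    ]


if __name__ == "__main__":
    run = FeedingRun(37.0, 1.0, 0.6, 10.0, 3.0)
    print(out_of_range(run))
```

The preferred sub-ranges (37 °C, 0.01 to 30 mM 1-deoxy-D-xylulose, 1 to 5 h) could be checked in the same way with a second table of limits.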

It has been found that the process of producing isoprenoid intermediates or products by the
genetically engineered organisms of the invention can be boosted by supplying a source for
CTP, for example cytidine and/or uridine and/or cytosine and/or uracil and/or ribose and/or
ribose 5-phosphate and/or any biosynthetic precursors of CTP at a concentration of 0.01 to 10
mM, preferably at a concentration of 0.3 to 1 mM, and/or by supplying a source for
phosphorylation activity, for example glycerol 3-phosphate and/or phosphoenolpyruvate and/or
ribose 5-phosphate at a concentration of 0.1 to 100 mM, preferably at a concentration of 0.5
to 10 mM, and/or inorganic phosphate and/or inorganic pyrophosphate at a concentration of 1
to 500 mM, preferably at a concentration of 10 to 100 mM, and/or any organic phosphate
and/or pyrophosphate, and/or by supplying a source for reduction equivalents, for example 0.1
to 1000 mM, preferably 10 to 1000 mM, lactate and/or succinate and/or glycerol and/or
glucose and/or lipids at a concentration of 0.1 to 100 mM, preferably at a concentration of 0.5
to 10 mM. A particularly efficient production process is specified in claims 72 and 80 to 84.

This process can also be used with great advantage for screening for inhibitors of the
enzymes involved or of downstream enzymes, depending on the choice of the isoprenoid
intermediate or product for detection. The enzymes dxs, dxr, ispD, ispE, ispF, ispG (formerly
gcpE) and ispH (formerly lytB) do not occur in animals. Therefore inhibitors against dxs, dxr,
ispD, ispE, ispF, ispG (formerly gcpE) and ispH (formerly lytB) have great value as (a)
herbicides against weed plants or algae; (b) antibiotic agents against pathogenic bacteria; (c)
agents against protozoa, like Plasmodium falciparum, the causative pathogen of malaria.

The activity of the said enzymes can be detected (in the presence or absence of a potential
inhibitor) by measuring either the formation of a product or the consumption of an intermediate,
preferably by TLC, HPLC or NMR.

With the finding that 1-hydroxy-2-methyl-2-butenyl 4-diphosphate is an intermediate of the non-
mevalonate terpenoid pathway we have acquired essential determinants of the structure of
inhibitors. Namely, the structures of a subset of inhibitors should be similar to at least a portion
of the starting compound or the product or the transition state between the starting compound,
e.g. 2C-methyl-D-erythritol 2,4-cyclodiphosphate, and the product, e.g. 1-hydroxy-2-methyl-2-
butenyl 4-diphosphate.

This invention discloses novel compounds, or salts thereof, of the following formula I: 




(I) 



whereby R1 and R2 are different from each other and one of R1 and R2 is hydrogen and the
other is selected from the group consisting of -CH2-O-PO(OH)-O-PO(OH)2, -CH2-O-PO(OH)2
and -CH2OH, and whereby A stands for -CH2OH or -CHO. These compounds may be
isotope-labelled.

In formula I, A preferably stands for -CH2OH.

Among R1 and R2, R1 is preferably hydrogen and R2 is preferably selected from the group
consisting of -CH2-O-PO(OH)-O-PO(OH)2 and -CH2-O-PO(OH)2.

In the group consisting of -CH2-O-PO(OH)-O-PO(OH)2 and -CH2-O-PO(OH)2,
-CH2-O-PO(OH)-O-PO(OH)2 is preferred.

If a compound of formula I is a salt, it may e.g. be a lithium, sodium, potassium, magnesium,
ammonium or manganese salt. These salts may derive from a single or from multiple
deprotonations of the (di)phosphoric acid moiety.

The novel compounds disclosed herein are useful for various applications, e.g. for screening for
genes, enzymes or inhibitors of the biosynthesis of isoprenoids or terpenoids, either in vitro in
the presence of an electron donor or in vivo.

This invention further provides a process for the chemical preparation of a compound of
formula I or a salt thereof:

[Structural formula (I): a carbon-carbon double bond carrying H3C and A on one carbon atom
and R2 and R1 on the other]

wherein A represents -CH2OH and R1 and R2 are different from each other and one of R1 and
R2 is hydrogen and the other is -CH2-O-PO(OH)-O-PO(OH)2, -CH2-O-PO(OH)2 or -CH2-OH, by
the following steps:

(a) converting a compound of the following formula (II):

[Structural formula (II)]

wherein B is a protective group into a compound of the following formula (III) or (IV):

[Structural formulae (III) and (IV)]

by a Wittig or Horner reagent, wherein the group D is a precursor group convertible
reductively to a -CH2-OH group;

(b) reductively converting group D to a -CH2-OH group; 

(c) optionally converting group -CH2-OH obtained in step (b) into -CH2-O-PO(OH)-O-
PO(OH)2 or -CH2-O-PO(OH)2 or salts thereof in a manner known per se;

(d) optionally converting the product to a desired salt;

(e) removing the protective group B. 



In the above process, said protective group B may be any group that allows a hydroxy group
to be regenerated at the position it is attached to. Said protective group B is preferably stable
under the conditions of step (a) to step (d). Said protective group B is removed in step (e) of
said process in order to generate a hydroxy group. Protective groups for hydroxy groups are
known to the skilled person. Group B may for example form an acetal group together with the
remaining moiety of the compound of formula (II), (III) or (IV). Acetals can be hydrolysed under
acidic conditions. Most preferably, group B is a 2-tetrahydropyranyl group.

In the above process, said group D is a precursor group convertible reductively to a -CH2-OH
group. Group D may be a derivative of a carboxylic acid. Examples of such a group include
alkoxycarbonyl and aminocarbonyl groups. Said aminocarbonyl groups may be substituted at
the amino group with one or two alkyl groups. It is most preferred to use alkoxycarbonyl
groups. The alkyl group of said alkoxycarbonyl groups or said alkyl groups of said
aminocarbonyl groups may be linear or branched alkyl groups which may be singly or
multiply substituted. Preferred are C1-C6 alkyl groups like methyl, ethyl, propyl, butyl, pentyl or
hexyl groups. Most preferred are methyl or ethyl groups. The most preferred example of said
group D is an ethoxycarbonyl group.

Said compound of formula (II) may be prepared by protecting the hydroxy group of hydroxy
acetone with said group B. If group B is a tetrahydropyranyl group, the compound of formula
(II) may be prepared from hydroxy acetone and 3,4-dihydro-2H-pyran, preferably employing
pyridinium toluene-4-sulfonate as a catalyst. A specific method for preparing acetonyl
tetrahydropyranyl ether is described in example 24.

In step (a) of said process, the compound of formula (II) is converted to a compound of
formula (III) or (IV) by a Wittig or a Horner reagent. Wittig-type reactions and reagents are
known to skilled persons (see e.g. Watanabe et al. 1996 and references cited therein).
Common Wittig reagents to be used for the above process are
methylene-triphenylphosphoranes which may be substituted at the methylene group. For the above
process of this invention, a methylene-triphenylphosphorane is employed which is substituted
with the above-defined group D at the methylene group. Such Wittig reagents are
commercially available or can be prepared according to known methods.
The olefin produced in step (a) may be formed as a mixture of the cis/trans isomers of
formulas (III) and (IV). If one of said isomers is preferred, it may be enriched or separated from
the other isomer by methods known in the art, preferably by chromatography. Alternatively, a
separation of said isomers may be carried out after one of the following steps (b) to (e).

In step (b) of the above process, group D of the compound of formula (III) or (IV) or a mixture
of said compounds is reductively converted to a -CH2-OH group. Various methods are known
in the art to perform such a reduction. Conditions are chosen such that group D is reduced
whereas the olefin moiety is not. Examples of reductants to be used in this step are molecular
hydrogen or metal hydrides. Examples of useful metal hydrides include boron hydrides like
sodium borohydride, aluminium hydrides like lithium aluminium hydride or
diisobutylaluminium hydride (DIBAH), alkali metal or alkaline earth metal hydrides like sodium
hydride or calcium hydride. Aluminium hydrides are preferred. A specific example for carrying
out step (b) is described in example 24.

If the desired end product of said process is a compound of formula (I), wherein R1 or R2 is
-CH2-OH, the compound or mixture of compounds obtained in step (b) may be directly
subjected to step (d) or step (e). Preferably, it is subjected to step (e) for removing protective
group B. If the desired end product of said process is a compound of formula (I), wherein R1
or R2 is -CH2-O-PO(OH)-O-PO(OH)2 or -CH2-O-PO(OH)2, the compound or mixture of compounds
obtained in step (b) is subjected to step (c) of said process for converting the -CH2-OH group
obtained in step (b) into a -CH2-O-PO(OH)-O-PO(OH)2 or a -CH2-O-PO(OH)2 group.

Step (c) may be carried out in several ways which are known to the skilled person. Step (c) may
comprise substituting the hydroxy group of said -CH2-OH group obtained in step (b) by a
leaving group. Step (c) may comprise converting said -CH2-OH group to a -CH2-halide group
by a halogenating agent. A sulfuric, sulfonic or phosphoric acid halide may be employed
as halogenating agent. Tosyl chloride is most preferred. Said halide may be fluoride, chloride,
bromide or iodide, preferably chloride. The compound carrying said -CH2-halide group is
preferably isolated. Said leaving group may further be created by reacting said -CH2-OH group
obtained in step (b) with a sulfonic acid halide, preferably tosyl chloride.
Said intermediate having said leaving group may then be reacted with phosphoric or
diphosphoric acid or singly or multiply deprotonated forms thereof. Preferably an
alkylammonium salt of phosphoric or diphosphoric acid is used, more preferably a
tetraalkylammonium salt, and most preferably a tetrabutylammonium salt. A polar aprotic
solvent is preferred for this reaction. Preferably, the compound or mixture of compounds
obtained is purified according to standard procedures. A specific example for carrying out step
(c) is described in example 24.

In step (d), the compound or mixture of compounds obtained in step (c) may be converted to
a desired salt. Methods for carrying out step (d) are well known. Such methods may comprise
adjusting the pH of an aqueous solution with an appropriate acid or salt to a desired pH value.

In step (e), the protective group B of a compound obtained in one of steps (b) to (d) is removed
in order to obtain a compound of formula (I) wherein A is -CH2-OH. The method for removing
a protective group depends on the type of the protective group. Such methods are well known.
If the protective group forms an acetal, removing said protecting group may be achieved by
acid hydrolysis (see example 24).

This invention provides a protein in a form that is functional for the enzymatic conversion of
2C-methyl-D-erythritol 2,4-cyclodiphosphate to 1-hydroxy-2-methyl-2-butenyl 4-diphosphate,
notably in its (E)-form, preferably in the presence of NADH and/or NADPH and/or in the
presence of Co2+. Said enzyme preferably has a sequence encoded by the ispG (formerly
gcpE) gene of E. coli or a function-conservative homologue of said sequence, i.e. said
homologue is capable of performing the same function as said protein. For many applications
of said protein, it may be expressed and purified as a fusion protein, notably a fusion with
maltose binding protein. In this way, enzymatically active protein may be readily obtained.

This invention further provides a protein in a form that is functional for the enzymatic
conversion of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate, notably in its (E)-form, to 
isopentenyl diphosphate and/or dimethylallyl diphosphate. Said protein preferably requires 
FAD and NAD(P)H for said functionality. Further, said protein may require a metal ion selected 
from the group of manganese, iron, cobalt, or nickel ion. Said protein preferably has a 
sequence encoded by the ispH (formerly lytB) gene of E. coli or a function-conservative 
homologue of said sequence. For many applications of said protein, it may be expressed and 
purified as a fusion protein, notably a fusion with maltose binding protein. In this way, 
enzymatically active protein may be readily obtained. 

The above proteins may be plant proteins, notably from Arabidopsis thaliana, bacterial
proteins, notably from E. coli, or protozoal proteins, notably from Plasmodium falciparum.

The invention further provides a purified isolated nucleic acid encoding one or both of the 
above proteins with or without introns. Further, the invention provides a DNA expression vector 
containing the sequence of said purified isolated nucleic acid.

The invention further provides cells, cell cultures, organisms or parts thereof recombinantly
endowed with the sequence of said purified isolated nucleic acid or with said DNA expression
vector, wherein said cell is selected from the group consisting of bacterial, protozoal, fungal,
plant, insect and mammalian cells. Said cells, cell cultures, organisms or parts thereof may
further be endowed with at least one gene selected from the following group: dxs; dxr; ispD
(formerly ygbP); ispE (formerly ychB); ispF (formerly ygbB) of E. coli or a function-conservative
homologue thereof, or a function-conservative fusion, deletion or insertion variant of any of the
above genes.

The invention further provides cells, cell cultures, or organisms or parts thereof transformed or
transfected for an increased rate of formation of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate,
notably in its (E)-form, compared to cells, cell cultures, or organisms or parts thereof absent
said transformation or transfection. The transformation or transfection preferably comprises
endowment with the gcpE gene of E. coli or with a function-conservative homologue from
another organism, e.g. a plant or protozoal organism.

The invention also provides cells, cell cultures, or organisms or parts thereof transformed or 
transfected for an increased rate of conversion of (E)-1-hydroxy-2-methyl-2-butenyl 4- 
diphosphate to isopentenyl diphosphate and/or dimethylallyl diphosphate compared to cells, 
cell cultures, or organisms or parts thereof absent said transformation or transfection. The 
transformation or transfection preferably comprises endowment with the lytB gene of E. coli or
with a function-conservative homologue from another organism, e.g. a plant or protozoal
organism.

The invention also provides cells, cell cultures, or organisms or parts thereof transformed or
transfected for an increased expression level of the protein of one of claims 1 to 4 and/or the 
protein of one of claims 5 to 8 compared to cells, cell cultures, or organisms or parts thereof 
absent said transformation or transfection. 

Moreover, the invention provides a method of altering the expression level of the gene
product(s) of ispG and/or ispH or function-conservative homologues from other organisms or
variants thereof in cells comprising

(a) transforming host cells with the ispG and/or ispH gene,

(b) growing the transformed host cells of step (a) under conditions that are suitable for the
efficient expression of ispG and/or ispH, resulting in production of altered levels of the
ispG and/or ispH gene product(s) in the transformed cells relative to expression levels
of untransformed cells.

Furthermore, the invention provides a method of identifying an inhibitor of an enzyme
functional for the conversion of 2C-methyl-D-erythritol 2,4-cyclodiphosphate to 1-hydroxy-2-
methyl-2-butenyl 4-diphosphate, notably its E-form, of the non-mevalonate isoprenoid pathway
by the following steps:

(a) incubating a mixture containing said enzyme with its, optionally isotope-labeled,
substrate 2C-methyl-D-erythritol 2,4-cyclodiphosphate under conditions suitable for
said conversion in the presence and in the absence of a potential inhibitor,

(b) subsequently determining the concentration of 2C-methyl-D-erythritol 2,4-
cyclodiphosphate and/or 1-hydroxy-2-methyl-2-butenyl 4-diphosphate, and

(c) comparing the concentration in the presence and in the absence of said potential
inhibitor.

Furthermore, the invention provides a method of identifying an inhibitor of an enzyme
functional for the conversion of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate, notably its E-
form, to isopentenyl diphosphate or dimethylallyl diphosphate of the non-mevalonate
isoprenoid pathway by the following steps:

(a) incubating a mixture containing said enzyme with its, optionally isotope-labeled,
substrate 1-hydroxy-2-methyl-2-butenyl 4-diphosphate under conditions suitable for
said conversion in the presence and in the absence of a potential inhibitor, whereby
said mixture preferably contains FAD,

(b) determining the concentration of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate and/or 
isopentenyl diphosphate or dimethylallyl diphosphate, and 

(c) comparing the concentration in the presence and in the absence of said potential 
inhibitor. 
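
By way of a non-limiting illustration, the comparison of step (c) in the above methods may be expressed as a percentage inhibition calculated from the product concentrations determined in step (b). The following Python sketch is not part of the original disclosure; the function name and the example concentrations are hypothetical.

```python
def percent_inhibition(conc_without_inhibitor: float, conc_with_inhibitor: float) -> float:
    """Percentage by which product formation is reduced by the candidate compound.

    Both arguments are product concentrations in the same unit, measured after
    identical incubations in the absence and in the presence of the candidate.
    """
    if conc_without_inhibitor <= 0:
        raise ValueError("the uninhibited control must show product formation")
    return 100.0 * (1.0 - conc_with_inhibitor / conc_without_inhibitor)


# Hypothetical values: 40 µM product in the control, 10 µM with the candidate.
print(percent_inhibition(40.0, 10.0))  # 75.0
```

If substrate consumption rather than product formation is followed, the same comparison may be applied to the amount of substrate consumed in each incubation.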

The above methods of identifying an inhibitor are preferably carried out by following the
consumption of NADPH or NADH making use of its characteristic absorbance spectrum.
Alternatively, the fluorescence of NADH or NADPH can be followed when excited around 340
nm. The above methods of identifying an inhibitor may advantageously be performed as high-
throughput screening assays for inhibitors, notably in combination with photometric detection
of the consumption of NADH or NADPH. Further, one or more flavin analogues (e.g. FAD,
FMN) may be added to the incubation mixtures in said methods, preferably in catalytic
amounts. Most preferred is the addition of FAD. Said enzymes may be employed in said
methods as fusion proteins with maltose binding protein (examples 38 to 41, 44, 45), which
allows straightforward expression and purification of said enzymes in enzymatically active
form. Further embodiments of said methods of identifying are defined in the subclaims to these
methods.
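
As an illustration of the photometric detection mentioned above, the decline in absorbance at 340 nm may be converted into a NAD(P)H consumption rate by means of the Lambert-Beer law. The following Python sketch is not part of the original disclosure. It assumes the commonly used molar absorption coefficient of 6220 M^-1 cm^-1 for NADH and NADPH at 340 nm and a 1 cm light path; the function names and the readings are hypothetical.

```python
EPSILON_340 = 6220.0  # M^-1 cm^-1, assumed coefficient for NADH/NADPH at 340 nm


def least_squares_slope(times, values):
    """Slope of the best-fit straight line through paired (time, value) readings."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    denominator = sum((t - mean_t) ** 2 for t in times)
    return numerator / denominator


def consumption_rate_uM_per_min(times_min, a340, path_cm=1.0):
    """Convert the decline in A340 into a NAD(P)H consumption rate in µM per minute."""
    slope = least_squares_slope(times_min, a340)  # absorbance units per minute
    return -slope / (EPSILON_340 * path_cm) * 1e6  # M/min converted to µM/min


# Hypothetical readings taken once per minute.
rate = consumption_rate_uM_per_min([0, 1, 2, 3], [1.0, 0.9378, 0.8756, 0.8134])
print(f"{rate:.2f} µM/min")
```

Rates obtained in the presence and in the absence of a potential inhibitor may then be compared in the same manner as the concentrations in step (c).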

It is known that intermediates of the non-mevalonate pathway are responsible for γδ T cell
activation by various pathogenic bacteria. γδ T cell activation is followed by T cell proliferation
and secretion of cytokines and chemokines, and is very likely crucial for regulating the immune
response following pathogen infection (Altincicek et al., 2001 and references cited therein).
Recently, it was shown that E. coli strains lost the ability to stimulate γδ T cells when the dxr or
the gcpE gene was knocked out, strongly indicating that an intermediate downstream of gcpE
and upstream of isopentenyl pyrophosphate exhibits the most potent antigenic activity
(Altincicek et al., 2001). However, the intermediate produced by the gcpE gene product in the
pathway has been unknown. Herein, this intermediate has surprisingly been identified as a
hitherto unprecedented compound, which opens up a whole range of novel applications for this
compound.

The compounds of formula I can be used as immunomodulatory or immunostimulating agents,
e.g. for activating γδ T cells. Immunomodulation via γδ T cell activation by said compounds may
prove useful not only to support combat against pathogens but also for various conditions for
which a stimulation of the immune system is desirable. The novel compounds of the invention may
therefore be used for medical treatment of pathogen infections. Such a treatment stimulates
the activity of the immune system against the pathogen. Preferably, the compound wherein
R1=H and/or A is -CH2OH is used for this application. Alternatively, the oxidation product with
A = CHO may prove to be highly active. Among the compounds of formula I, the one with the
highest or most suitable γδ T cell stimulating activity may be selected in a test system known
in the art (e.g. that described by Altincicek et al., 2001). Importantly, since the compounds of
the invention do not act as antibiotics, development of resistance is not a problem for the
method of treatment disclosed herein.

In an advantageous embodiment, said compounds may be combined with an antibiotically
active compound for treating a pathogen infection. Such a treatment combines the advantages
of inhibiting pathogen proliferation by an antibiotic and stimulating the immune system against
the pathogen, resulting in a much faster and more efficient treatment. Such an antibiotically
active compound may be a bacteriostatic antibiotic (e.g. tetracyclines).

Therefore, the novel compounds of this invention may be used for the preparation of a 
medicament. The invention further pertains to a pharmaceutical composition containing a 
compound of formula I and a pharmaceutically acceptable carrier. Said pharmaceutical 
composition may further contain an antibiotically active compound as mentioned above. 

This invention further comprises antibodies against the compounds of formula I. Said
antibodies may be polyclonal or monoclonal and may be raised according to conventional
techniques. Raising such antibodies will comprise coupling of a compound of formula I as a
hapten to a macromolecular carrier like a protein in order to be immunogenic. Such an
immunogenic compound of formula I may further be used as a vaccine.
The antibodies of the invention may be used for detecting a compound of formula I. Since said
compounds are produced by organisms having the non-mevalonate isoprenoid pathway, such
organisms may be detected using said antibodies. Preferably, such organisms may be
detected in body fluids in a diagnostic method, thereby indicating an infection by a pathogen
having the non-mevalonate pathway. A positive result in such a diagnostic method may at the
same time indicate possible treatment by the compounds of the invention.
When an antibody of the invention is used for detecting a compound of formula I, it is
preferably labelled to allow photometric detection and/or immobilized to a support. Such
methods are well-known in the art.

This invention further provides a process for the chemical preparation of a compound of 
formula I or a salt thereof (see Fig. 7): 




(I) 



wherein A represents -CH2OH or -CHO, R1 is hydrogen, and R2 is -CH2-O-PO(OH)-O-
PO(OH)2, -CH2-O-PO(OH)2 or -CH2-OH, by the following steps:

(a) converting 2-methyl-2-vinyloxirane into 4-chloro-2-methyl-2-buten-1-al;

(b) converting 4-chloro-2-methyl-2-buten-1-al to its acetal; 

(c) substituting the chlorine atom in the product of step (b) by a hydroxyl group, a phosphate 
group or a pyrophosphate group; 

(d) hydrolysing the acetal obtained in step (c) to produce an aldehyde group; 

(e) optionally converting the aldehyde group of the product of step (d) to a -CH2OH group.

Preferred embodiments of this process are defined in the subclaims and are exemplified in
example 42.

The invention will now be described in detail with reference to specific examples.

Example 1

Construction of a vector carrying the xylB gene of Escherichia coli capable of transcription
and expression of D-xylulokinase

Chromosomal DNA from Escherichia coli strain XL1-Blue (Bullock et al. 1987; commercial
source: Stratagene, La Jolla, CA, USA) is isolated according to a method described by Meade
et al. 1982.

The E. coli ORF xylB (accession no. gb AE000433) from base pair (bp) position 8596 to 10144
is amplified by PCR using chromosomal E. coli DNA as template. The reaction mixture
contains 10 pmol of the primer 5'-
CCGTCGGAATTCGAGGAGAAATTAACCATGTATATCGGGATAGATCTTGG-3', 10 pmol of
the primer 5'-GCAGTGAAGCTTTTACGCCATTAATGGCAGAAGTTGC-3', 20 ng of
chromosomal DNA, 2 U of Taq DNA polymerase (Eurogentec, Seraing, Belgium) and 20 nmol
of dNTPs in a total volume of 100 µl containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-
hydrochloride, pH 8.8 and 0.1 % (w/w) Triton X-100.
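
For orientation, the amounts stated above may be converted into final concentrations in the 100 µl reaction. The following Python sketch is not part of the original disclosure; the function name is hypothetical, and the dNTP figure is treated as the total amount stated in the text.

```python
def final_concentration_uM(amount_pmol: float, volume_ul: float) -> float:
    """Final concentration in µM; pmol per µl is numerically equal to µmol per litre."""
    return amount_pmol / volume_ul


REACTION_VOLUME_UL = 100.0

primer_uM = final_concentration_uM(10.0, REACTION_VOLUME_UL)    # 10 pmol of each primer
dntp_uM = final_concentration_uM(20_000.0, REACTION_VOLUME_UL)  # 20 nmol = 20 000 pmol dNTPs
template_ng_per_ul = 20.0 / REACTION_VOLUME_UL                  # 20 ng chromosomal DNA

print(primer_uM, dntp_uM, template_ng_per_ul)
```

On this reading each primer is present at 0.1 µM, the dNTPs at 200 µM in total and the template at 0.2 ng/µl.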

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles of 60 sec at 94 °C, 60 sec
at 50 °C and 75 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.
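
The programmed hold time of the above temperature profile may be summed as follows. This Python sketch is not part of the original disclosure; the function name is hypothetical, and ramp times between temperatures, which depend on the thermocycler used, are deliberately ignored.

```python
def program_seconds(initial_denaturation_s, cycles, step_seconds, final_extension_s):
    """Total programmed hold time of a PCR profile, excluding temperature ramps."""
    return initial_denaturation_s + cycles * sum(step_seconds) + final_extension_s


# 3 min at 94 °C, then 30 cycles of 60 s / 60 s / 75 s, then 10 min at 72 °C.
total_s = program_seconds(180, 30, (60, 60, 75), 600)
print(total_s, "s =", total_s / 60, "min")  # 6630 s = 110.5 min
```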

The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden, Germany).

1.0 µg of the vector pBluescript SKII- (Stratagene) and 0.5 µg of the purified PCR product are
digested with EcoRI and HindIII in order to produce DNA fragments with overlapping ends. The
restriction mixtures are prepared according to the conditions recommended by the supplier (New
England Biolabs, Frankfurt am Main, Germany (NEB)) and are incubated for 3 h at 37 °C.
Digested vector DNA and PCR product are purified using the PCR purification kit from Qiagen.
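
The two primers of this example carry the recognition sequences of the enzymes used for the digestion, which may be verified with a simple string search. The following Python sketch is not part of the original disclosure. The primer sequences are those given above; the recognition sequences GAATTC (EcoRI) and AAGCTT (HindIII) are the standard ones, and the function name is hypothetical.

```python
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT"}

FORWARD_PRIMER = "CCGTCGGAATTCGAGGAGAAATTAACCATGTATATCGGGATAGATCTTGG"
REVERSE_PRIMER = "GCAGTGAAGCTTTTACGCCATTAATGGCAGAAGTTGC"


def site_positions(sequence: str, site: str) -> list:
    """0-based start positions of every occurrence of a recognition sequence."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


for name, primer in (("forward", FORWARD_PRIMER), ("reverse", REVERSE_PRIMER)):
    for enzyme, site in SITES.items():
        print(name, enzyme, site_positions(primer, site))
```

Each primer contains exactly one of the two sites, six nucleotides from its 5' end, so that the amplificate can be inserted into the vector in a defined orientation.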

20 ng of the purified vector DNA and 20 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pBSxylB. The ligation mixture is incubated for 2 h at 25 °C. 1 µl of the
ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells according to a
method described by Dower et al., 1988. The plasmid pBSxylB is isolated with the plasmid
isolation kit from Qiagen.
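
The masses used in the ligation may be related to a molar insert-to-vector ratio, since the molar amount of a linear double-stranded fragment is proportional to its mass divided by its length. The following Python sketch is not part of the original disclosure. The fragment lengths of 1500 bp and 3000 bp are hypothetical round numbers chosen for illustration and are not values stated in the text.

```python
def insert_to_vector_molar_ratio(insert_ng, insert_bp, vector_ng, vector_bp):
    """Molar ratio of insert to vector for linear double-stranded DNA fragments."""
    if min(insert_ng, insert_bp, vector_ng, vector_bp) <= 0:
        raise ValueError("masses and lengths must be positive")
    return (insert_ng / insert_bp) / (vector_ng / vector_bp)


# Hypothetical lengths: a 1500 bp insert and a 3000 bp vector, 20 ng of each.
ratio = insert_to_vector_molar_ratio(20.0, 1500, 20.0, 3000)
print(round(ratio, 2))
```

With equal masses, the shorter fragment is present in proportionally higher molar amount.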

The DNA insert of the plasmid pBSxylB is sequenced by the automated dideoxynucleotide
method (Sanger et al., 1992) using an ABI Prism 377™ DNA sequencer from Perkin Elmer
(Norwalk, USA) with the ABI Prism™ Sequencing Analysis Software from Applied Biosystems
Divisions (Foster City, USA). It is identical with the DNA sequence of the database entry (gb
AE000433).

Example 2 

Construction of a vector carrying the xylB and dxr genes of Escherichia coli capable of
transcription and expression of D-xylulokinase and DXP reductoisomerase

The E. coli ORF dxr (accession no. gb AE000126) from base pair (bp) position 9887 to 11083
is amplified by PCR using chromosomal E. coli DNA as template. The reaction mixture
contains 10 pmol of the primer 5'-
CTAGCCAAGCTTGAGGAGAAATTAACCATGAAGCAACTCACCATTCTGG-3', 10 pmol of the
primer 5'-GGAGATGTCGACTCAGCTTGCGAGACGC-3', 20 ng of chromosomal DNA, 2 U of
Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs in a total volume of 100 µl
containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and 0.1 % (w/w)
Triton X-100.
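
The length of the amplified region follows from the base pair positions given above, counting both end positions. The following Python sketch is not part of the original disclosure; the function name is hypothetical, and the positions are those stated in the text for dxr.

```python
def span_length_bp(first_bp: int, last_bp: int) -> int:
    """Length of a sequence span whose first and last positions are both included."""
    if last_bp < first_bp:
        raise ValueError("last position must not precede first position")
    return last_bp - first_bp + 1


DXR_SPAN = (9887, 11083)  # positions stated for the dxr ORF in gb AE000126

length = span_length_bp(*DXR_SPAN)
codons, remainder = divmod(length, 3)
print(length, codons, remainder)  # 1197 399 0
```

A remainder of zero is consistent with the span covering a complete open reading frame including its stop codon.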

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles of 60 sec at 94 °C, 60 sec
at 50 °C and 75 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.

The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden).

1.2 µg of the vector pBSxylB (Example 1) and 0.6 µg of the purified PCR product are digested
with HindIII and SalI in order to produce DNA fragments with overlapping ends. The restriction
mixtures are prepared according to the conditions recommended by the supplier (NEB) and are
incubated for 3 h at 37 °C. Digested vector DNA and PCR product are purified using the PCR
purification kit from Qiagen.

20 ng of the purified vector DNA and 18 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pBSxylBdxr. The ligation mixture is incubated for 2 h at 25 °C. 1 µl of the
ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The plasmid
pBSxylBdxr is isolated with the plasmid isolation kit from Qiagen.

The DNA insert of the plasmid pBSxylBdxr is sequenced by the automated dideoxynucleotide
method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with the ABI Prism™
Sequencing Analysis Software from Applied Biosystems Divisions. It is identical with the DNA
sequence of the database entry (gb AE000126).

The DNA sequence of the vector construct pBSxylBdxr is shown in Annex A.

Example 3

Construction of a vector carrying the xylB, dxr and ispD genes of Escherichia coli capable of
transcription and expression of D-xylulokinase, DXP reductoisomerase and CDP-ME synthase

The E. coli ORF ispD (accession no. gb AE000358) from base pair (bp) position 6754 to 7464
is amplified by PCR using chromosomal E. coli DNA as template. The reaction mixture
contains 10 pmol of the primer 5'-
CCGGGAGTCGACGAGGAGAAATTAACCATGGCAACCACTCATTTGGATG-3', 10 pmol of
the primer 5'-GTCCAACTCGAGTTATGTATTCTCCTTGATGG-3', 20 ng of chromosomal DNA,
2 U of Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs in a total volume of 100 µl
containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and 0.1 % (w/w)
Triton X-100.

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles of 30 sec at 94 °C, 30 sec
at 50 °C and 45 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.

The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden).

1.5 µg of the vector pBSxylBdxr (Example 2) and 0.8 µg of the purified PCR product are
digested with SalI and XhoI in order to produce DNA fragments with overlapping ends. The
restriction mixtures are prepared according to the conditions recommended by the supplier (NEB)
and are incubated for 3 h at 37 °C. Digested vector DNA and PCR product are purified using
the PCR purification kit from Qiagen.

20 ng of the purified vector DNA and 12 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pBSxylBdxrispD. The ligation mixture is incubated for 2 h at 25 °C. 1 µl of
the ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The plasmid
pBSxylBdxrispD is isolated with the plasmid isolation kit from Qiagen.

The DNA insert of the plasmid pBSxylBdxrispD is sequenced by the automated
dideoxynucleotide method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with
the ABI Prism™ Sequencing Analysis Software from Applied Biosystems Divisions. It is
identical with the DNA sequence of the database entry (gb AE000126).

The DNA sequence of the vector construct pBSxylBdxrispD is shown in Annex B.

Example 4 

Construction of a vector carrying the xylB, dxr, ispD and ispF genes of Escherichia coli capable
of transcription and expression of D-xylulokinase, DXP reductoisomerase, CDP-ME synthase,
and cMEPP synthase

The E. coli ORFs ispD and ispF (accession no. gb AE000358) from base pair (bp) position
6275 to 7464 are amplified by PCR using chromosomal E. coli DNA as template. The reaction
mixture contains 10 pmol of the primer 5'-
CCGGGAGTCGACGAGGAGAAATTAACCATGGCAACCACTCATTTGGATG-3', 10 pmol of
the primer 5'-TATGAAGTGGAGTGA I I I I GTTGCCTTAATGAG-3', 20 ng of chromosomal
DNA, 2 U of Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs in a total volume of
100 µl containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and 0.1 %
(w/w) Triton X-100.

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles of 60 sec at 94 °C, 60 sec
at 50 °C and 75 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.

The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden).

1.4 µg of the vector pBSxylBdxr (Example 2) and 0.7 µg of the purified PCR product are
digested with SalI and XhoI in order to produce DNA fragments with overlapping ends. The
restriction mixtures are prepared according to the conditions recommended by the supplier (NEB)
and are incubated for 3 h at 37 °C. Digested vector DNA and PCR product are purified using
the PCR purification kit from Qiagen.

20 ng of the purified vector DNA and 18 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pBSxylBdxrispDF. The ligation mixture is incubated for 2 h at 25 °C. 1 µl
of the ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The plasmid
pBSxylBdxrispDF is isolated with the plasmid isolation kit from Qiagen.

The DNA insert of the plasmid pBSxylBdxrispDF is sequenced by the automated
dideoxynucleotide method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with
the ABI Prism™ Sequencing Analysis Software from Applied Biosystems Divisions. It is
identical with the DNA sequence of the database entry (gb AE000126).

Example 5 

Construction of a vector carrying the xylB, dxr, ispD, ispE and ispF genes of Escherichia coli
capable of transcription and expression of D-xylulokinase, DXP reductoisomerase, CDP-ME
synthase, CDP-ME kinase and cMEPP synthase

The E. coli ORF ispE (accession no. gb AE000219) from base pair (bp) position 5720 to 6571
is amplified by PCR using chromosomal E. coli DNA as template. The reaction mixture
contains 10 pmol of the primer 5'-
GGGAAGGTGGAGGAGGAGAAATTAAGGATGGGGAGAGAGTGGGGG-3', 10 pmol of the
primer 5'-GGTGAGGGTAGGTTAAAGCATGGGTGTGTGG-3', 20 ng of chromosomal DNA, 2
U of Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs in a total volume of 100 µl
containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and 0.1 % (w/w)
Triton X-100.

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles of 45 sec at 94 °C, 45 sec
at 50 °C and 60 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.
The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden).
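
The base pair positions quoted for the amplified regions in examples 4 to 7 translate directly into the length of genomic sequence each primer pair spans. The sketch below only does that arithmetic on the coordinates given in the text; it assumes the positions are inclusive and ignores any extra bases that the primer tails add to the final amplificate.

```python
def inclusive_span_bp(start: int, end: int) -> int:
    """Number of base pairs in the inclusive interval [start, end]."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates as quoted in examples 4 to 7 (database accession in brackets).
regions = {
    "ispD/ispF (gb AE000358)": (6275, 7464),
    "ispE (gb AE000219)": (5720, 6571),
    "gcpE (gb AE000338)": (372, 1204),
    "crtY/crtI/crtB (gb D90087)": (2372, 6005),
    "crtE (gb D90087)": (175, 1148),
}

spans = {name: inclusive_span_bp(*pos) for name, pos in regions.items()}

for name, length in spans.items():
    print(f"{name}: {length} bp")
```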

1.2 µg of the vector pBSxylBdxrispDF (Example 4) and 0.6 µg of the purified PCR product are
digested with XhoI and KpnI in order to produce DNA fragments with overlapping ends. The
restriction mixtures are prepared according to the conditions specified by the supplier (NEB)
and are incubated for 3 h at 37 °C. Digested vector DNA and PCR product are purified using
the PCR purification kit from Qiagen.

20 ng of the purified vector DNA and 15 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pBScyclo. The ligation mixture is incubated for 2 h at 25 °C. 1 µl of the
ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The plasmid
pBScyclo is isolated with the plasmid isolation kit from Qiagen.

The DNA insert of the plasmid pBScyclo is sequenced by the automated dideoxynucleotide 
method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with the ABI Prism™ 
Sequencing Analysis Software from Applied Biosystems Divisions. It is identical with the DNA 
sequence of the database entry (gb AE000219). The DNA sequence of the vector construct 
pBScyclo is shown in Annex C.

Example 6 

Construction of a vector carrying the gcpE gene of Escherichia coli capable of its transcription
and expression

The E. coli ORF gcpE (accession no. gb AE000338) from base pair (bp) position 372 to 1204
is amplified by PCR using chromosomal E. coli DNA as template. The reaction mixture
contains 10 pmol of the primer 5'-
GGTAGGGGATGGGAGGAGAAATTAAGGATGGATAAGGAGGGTGGAATTG-3', 10 pmol of the
primer 5'-GGGATGGTGGAGTTATTTTTGAAGGTGCTGAAGGTC-3', 20 ng of chromosomal
DNA, 2 U of Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs in a total volume of
100 µl containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and 0.1 %
(w/w) Triton X-100.

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles of 60 sec at 94 °C, 60 sec
at 50 °C and 90 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.
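
The temperature program above can be summed to give the total programmed hold time. This is a minimal sketch that counts only the hold times stated in the text; the time the thermocycler needs to ramp between temperatures depends on the instrument and is not included, and the function name is illustrative.

```python
def program_hold_time_s(initial_s: int, cycles: int, step_times_s: tuple, final_s: int) -> int:
    """Total programmed hold time: initial denaturation + cycles + final extension."""
    return initial_s + cycles * sum(step_times_s) + final_s


# 3 min at 94 °C, 30 cycles of 60/60/90 sec, then 10 min at 72 °C
total_s = program_hold_time_s(3 * 60, 30, (60, 60, 90), 10 * 60)

print(f"{total_s} s = {total_s / 60:.0f} min")
```

For this program the hold times add up to 7080 s, i.e. just under two hours before the final cooling step.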

The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden).

2.0 µg of the vector pACYC184 (Chang and Cohen 1978, NEB) and 0.7 µg of the purified PCR
product are digested with BamHI and SalI in order to produce DNA fragments with overlapping
ends. The restriction mixtures are prepared according to the conditions specified by the
supplier (NEB) and are incubated for 3 h at 37 °C. Digested vector DNA and PCR product are
purified using the PCR purification kit from Qiagen.

20 ng of the purified vector DNA and 20 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pACYCgcpE. The ligation mixture is incubated for 2 h at 25 °C. 1 µl of the
ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The plasmid
pACYCgcpE is isolated with the plasmid isolation kit from Qiagen.

The DNA insert of the plasmid pACYCgcpE is sequenced by the automated dideoxynucleotide
method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with the ABI Prism™
Sequencing Analysis Software from Applied Biosystems Divisions. It is identical with the DNA
sequence of the database entry (gb AE000338).

The DNA sequence of the vector construct pACYCgcpE is shown in Annex D.

Example 7

Construction of vectors carrying a carotenoid operon from Erwinia uredovora capable of the
in vivo production of β-carotene

The open reading frames crtY, crtI and crtB of a carotenoid operon from Erwinia uredovora
(accession no. gb D90087) from base pair (bp) position 2372 to 6005 are amplified by PCR
using chromosomal E. uredovora DNA as template. The reaction mixture contains 10 pmol of
the primer 5'-CATTGAGAAGCTTATGTGCACCG-3', 10 pmol of the primer 5'-
CTCCGGGGTCGACATGGCGC-3', 40 ng of chromosomal DNA of E. uredovora, 8 U of Taq
DNA polymerase (Eurogentec), 20 nmol of dNTPs and Taq Extender (Stratagene) in a total
volume of 100 µl of 1x Taq Extender buffer (Stratagene).

The mixture is denatured for 3 min at 94 °C. Then 40 PCR cycles of 60 sec at 94 °C, 60 sec
at 50 °C and 300 sec at 72 °C follow. After further incubation for 20 min at 72 °C, the
mixture is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.
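
The extension times in examples 4 to 7 scale with the size of the region to be amplified. The sketch below compares each programmed 72 °C step with the time an assumed extension rate would need. The rate of about 1 kb per minute is a common rule of thumb for Taq polymerase and is an assumption here, not a figure from the text; region lengths are computed from the quoted base pair positions.

```python
ASSUMED_RATE_BP_PER_S = 1000 / 60  # rule-of-thumb rate, an assumption (about 1 kb/min)

# (region length in bp from the quoted positions, programmed extension time in s)
programs = {
    "example 4, ispD/ispF": (7464 - 6275 + 1, 75),
    "example 5, ispE": (6571 - 5720 + 1, 60),
    "example 6, gcpE": (1204 - 372 + 1, 90),
    "example 7, crtY/crtI/crtB": (6005 - 2372 + 1, 300),
    "example 7, crtE": (1148 - 175 + 1, 60),
}


def extension_margin(length_bp: int, extension_s: float,
                     rate_bp_per_s: float = ASSUMED_RATE_BP_PER_S) -> float:
    """Programmed extension time divided by the time the assumed rate would need."""
    return extension_s / (length_bp / rate_bp_per_s)


margins = {name: extension_margin(*values) for name, values in programs.items()}

for name, margin in margins.items():
    print(f"{name}: {margin:.2f}x the assumed minimum")
```

With this assumed rate every program leaves a margin above 1, the 300 sec step for the 3.6 kb carotenoid operon fragment included.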

The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden, Germany).

1.0 µg of the vector pBluescript SKII- (Stratagene) and 2.0 µg of the purified PCR product are
digested with HindIII and SalI in order to produce DNA fragments with overlapping ends. The
restriction mixtures are prepared according to the conditions specified by the supplier (NEB)
and are incubated for 3 h at 37 °C. Digested vector DNA and PCR product are purified using
the PCR purification kit from Qiagen.

20 ng of the purified vector DNA and 40 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pBScaro34. The ligation mixture is incubated for 2 h at 25 °C. 1 µl of the
ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The plasmid
pBScaro34 is isolated with the plasmid isolation kit from Qiagen.

The DNA insert of the plasmid pBScaro34 is sequenced by the automated dideoxynucleotide 
method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with the ABI Prism™ 
Sequencing Analysis Software from Applied Biosystems Divisions. It is identical with the DNA 
sequence of the database entry (gb D90087). 

The E. uredovora ORF crtE (accession no. gb D90087) from base pair (bp) position 175 to
1148 is amplified by PCR using chromosomal E. coli DNA as template. The reaction mixture
contains 10 pmol of the primer 5'-GGGGATGTTTGGAATTGGGG-3', 10 pmol of the primer 5'-
ATGGAGGAAGGTTAAGTGAGGGG-3', 20 ng of chromosomal DNA, 2 U of Taq DNA
polymerase (Eurogentec, Seraing, Belgium) and 20 nmol of dNTPs in a total volume of 100 µl
containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and 0.1 % (w/w)
Triton X-100.

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles of 45 sec at 94 °C, 45 sec
at 50 °C and 60 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.
The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden, Germany).

1.5 µg of the vector pBScaro34 (see above) is digested with EcoRI and HindIII, and 0.6 µg of
the purified PCR product is digested with MfeI and HindIII in order to produce DNA fragments
with overlapping ends. The restriction mixtures are prepared according to the conditions
specified by the supplier (NEB) and are incubated for 3 h at 37 °C. Digested vector DNA and
PCR product are purified using the PCR purification kit from Qiagen.

20 ng of the purified vector DNA and 16 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pBScaro14. The ligation mixture is incubated for 2 h at 25 °C. 1 µl of the
ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The plasmid
pBScaro14 is isolated with the plasmid isolation kit from Qiagen.

The DNA insert of the plasmid pBScaro14 is sequenced by the automated dideoxynucleotide
method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with the ABI Prism™
Sequencing Analysis Software from Applied Biosystems Divisions. It is identical with the DNA
sequence of the database entry (gb D90087). The DNA sequence of the plasmid pBScaro14
is shown in Annex E.

5 µg of the vector pBScaro14 (see above) is digested with BamHI and SalI. The restriction
mixture is prepared according to the conditions specified by the supplier (NEB) and is
incubated for 3 h at 37 °C. The restriction mixture is separated on an agarose gel and the
fragments of 2237 and 2341 bp size are purified with the gel extraction kit from Qiagen.

3 µg of the vector pACYC184 (see above) is digested with BamHI and SalI. The restriction
mixture is prepared according to the conditions specified by the supplier (NEB) and is
incubated for 3 h at 37 °C. The restriction mixture is separated on an agarose gel and the
fragment of 3968 bp size is purified with the gel extraction kit from Qiagen.

30 ng of the purified vector DNA and 25 ng each of the purified 2237 and 2341 bp fragments
are ligated together with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total
volume of 10 µl, yielding the plasmid pACYCcaro14. The ligation mixture is incubated for 2 h
at 25 °C. 1 µl of the ligation mixture is transformed into electrocompetent E. coli XL1-Blue
cells. The plasmid pACYCcaro14 is isolated with the plasmid isolation kit from Qiagen.
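
Because the three fragments in this ligation differ in size, equal masses do not correspond to equal numbers of molecules. The sketch below converts the quoted masses and fragment sizes into molar amounts. It assumes an average mass of 650 g/mol per base pair of double-stranded DNA, which is a widely used approximation and not a value from the text.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one base pair of dsDNA


def dsdna_fmol(mass_ng: float, length_bp: int) -> float:
    """Approximate molar amount (fmol) of a double-stranded DNA fragment."""
    return mass_ng * 1e6 / (length_bp * AVG_BP_MASS_G_PER_MOL)


vector_fmol = dsdna_fmol(30.0, 3968)    # pACYC184 fragment
insert_a_fmol = dsdna_fmol(25.0, 2237)  # first pBScaro14 fragment
insert_b_fmol = dsdna_fmol(25.0, 2341)  # second pBScaro14 fragment

print(f"vector:   {vector_fmol:.1f} fmol")
print(f"2237 bp:  {insert_a_fmol:.1f} fmol ({insert_a_fmol / vector_fmol:.2f} per vector)")
print(f"2341 bp:  {insert_b_fmol:.1f} fmol ({insert_b_fmol / vector_fmol:.2f} per vector)")
```

On this estimate each insert fragment is present in roughly 1.4 to 1.5-fold molar excess over the vector.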

The DNA insert of the plasmid pACYCcaro14 is sequenced by the automated
dideoxynucleotide method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with
the ABI Prism™ Sequencing Analysis Software from Applied Biosystems Divisions. It is
identical with the DNA sequence of the database entry (gb D90087). The DNA sequence of the
plasmid pACYCcaro14 is shown in Annex F.

Example 8 

Enzymatic preparation of [U-13C5]1-deoxy-D-xylulose 5-phosphate

A reaction mixture containing 960 mg of [U-13C6]glucose (5.1 mmol), 6.1 g of ATP (10.2 mmol),
337 mg of thiamine pyrophosphate, 1.14 g of [2,3-13C2]pyruvate (10.2 mmol), 10 mM MgCl2 and
5 mM dithiothreitol in 150 mM Tris hydrochloride, pH 8.0, is prepared. 410 Units of triose
phosphate isomerase (from rabbit muscle, Type III-S, E. C. 5.3.1.1, Sigma), 100 U of
hexokinase (from Bakers Yeast, Type VI, E. C. 2.7.1.1, Sigma), 100 U of phosphoglucose
isomerase (from Bakers Yeast, Type III, E. C. 5.3.1.9, Sigma), 100 U of phosphofructokinase
(from Bacillus stearothermophilus, Type VII, E. C. 2.7.1.11, Sigma), 50 U of aldolase (from rabbit
muscle, E. C. 4.1.2.13, Sigma) and 12 U of recombinant DXP synthase from B. subtilis are
added to a final volume of 315 ml. The reaction mixture is incubated at 37 °C overnight, and
during incubation the pH is held at a constant value of 8.0. The reaction is monitored by 13C
NMR spectroscopy.
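
The amounts of ATP and pyruvate in this mixture follow from the stoichiometry of the enzyme cascade. The sketch below assumes the standard reaction sequence for the enzymes listed: hexokinase and phosphofructokinase each consume one ATP per glucose, aldolase and triose phosphate isomerase together yield two glyceraldehyde 3-phosphate per glucose, and DXP synthase condenses each of these with one pyruvate. The variable names are illustrative.

```python
GLUCOSE_MMOL = 5.1           # [U-13C6]glucose quoted in the text

ATP_PER_GLUCOSE = 2          # hexokinase + phosphofructokinase
TRIOSE_P_PER_GLUCOSE = 2     # aldolase + triose phosphate isomerase
PYRUVATE_PER_TRIOSE_P = 1    # DXP synthase: pyruvate + glyceraldehyde 3-phosphate

atp_needed_mmol = GLUCOSE_MMOL * ATP_PER_GLUCOSE
pyruvate_needed_mmol = GLUCOSE_MMOL * TRIOSE_P_PER_GLUCOSE * PYRUVATE_PER_TRIOSE_P
dxp_max_mmol = GLUCOSE_MMOL * TRIOSE_P_PER_GLUCOSE

print(f"ATP required:       {atp_needed_mmol:.1f} mmol")
print(f"pyruvate required:  {pyruvate_needed_mmol:.1f} mmol")
print(f"theoretical DXP:    {dxp_max_mmol:.1f} mmol")
```

Both requirements come to 10.2 mmol, which matches the two equivalents of ATP and pyruvate quoted for the mixture.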

Example 9 

Enzymatic preparation of [3,4,5-13C3]1-deoxy-D-xylulose 5-phosphate

A solution containing 150 mM Tris hydrochloride, 10 mM MgCl2, 1.0 g of [U-13C6]glucose (5.4
mmol), 0.23 g (1.5 mmol) of dithiothreitol, 0.3 g (0.7 mmol) of thiamine pyrophosphate, 0.1 g
(0.2 mmol) of ATP (disodium salt), and 2.2 g (11 mmol) of phosphoenol pyruvate (potassium
salt) is adjusted to pH 8.0 by the addition of 8 M sodium hydroxide. 403 U (2.8 mg) of pyruvate
kinase (from rabbit muscle, E. C. 2.7.1.40), 410 Units of triose phosphate isomerase (from
rabbit muscle, Type III-S, E. C. 5.3.1.1, Sigma), 100 U of hexokinase (from Bakers Yeast, Type
VI, E. C. 2.7.1.1, Sigma), 100 U of phosphoglucose isomerase (from Bakers Yeast, Type III, E.
C. 5.3.1.9, Sigma), 100 U of phosphofructokinase (from Bacillus stearothermophilus, Type VII, E.
C. 2.7.1.11, Sigma), 50 U of aldolase (from rabbit muscle, E. C. 4.1.2.13, Sigma) and 12 U of
recombinant DXP synthase from B. subtilis are added to a final volume of 300 ml. The reaction
mixture is incubated at 37 °C overnight.

Example 10 

Enzymatic preparation of 1-deoxy-D-xylulose 

The pH value of the reaction mixture obtained in example 8 or 9 is adjusted to 9.5. Magnesium
chloride is added to a concentration of 30 mM. 50 mg (950 Units) of alkaline phosphatase from
bovine intestinal mucosa (Sigma, E. C. 3.1.3.1) are added and the reaction mixture is
incubated for 16 h. The conversion is monitored by 13C NMR spectroscopy. The pH is adjusted
to a value of 7.0 and the solution is centrifuged at 14,000 rpm for 5 minutes. Starting from
labelled glucose (examples 8 or 9) the overall yield of 1-deoxy-D-xylulose is approximately
50 %.

The supernatant or the lyophilised supernatant is used in incorporation experiments (see 
examples 11 to 17). 
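
The quoted overall yield can be checked against the dose used in example 11, where 50 ml of the crude solution is stated to contain 0.9 mmol of 1-deoxy-D-xylulose. The sketch below is a rough consistency check under assumptions that are not stated in the text: the yield is taken relative to two equivalents of product per glucose, and the volume of the example 9 mixture (300 ml) is taken as unchanged by the pH adjustment and the dephosphorylation.

```python
glucose_mmol = 5.4            # example 9
product_per_glucose = 2       # assumed: two triose phosphates per glucose
overall_yield = 0.5           # "approximately 50 %"
volume_ml = 300.0             # final volume in example 9, assumed unchanged

dx_mmol = glucose_mmol * product_per_glucose * overall_yield
dx_conc_mm = dx_mmol / volume_ml * 1000.0     # mmol per litre
dose_mmol = dx_conc_mm * 50.0 / 1000.0        # amount in a 50 ml portion

print(f"1-deoxy-D-xylulose formed: {dx_mmol:.1f} mmol")
print(f"concentration:             {dx_conc_mm:.0f} mM")
print(f"amount in 50 ml:           {dose_mmol:.1f} mmol")
```

Under these assumptions a 50 ml portion holds about 0.9 mmol, in agreement with the figure given in example 11.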

Example 11

Incorporation experiment with recombinant Escherichia coli XL1-pBSxylB using
[3,4,5-13C3]1-deoxy-D-xylulose

0.2 litre of Luria Bertani (LB) medium containing 36 mg of ampicillin is inoculated with 10 ml
of an overnight culture of E. coli strain XL1-Blue harbouring the plasmid pBSxylB (see example
1). The cells are grown in a shaking culture at 37 °C. At an optical density (600 nm) of 0.6 the
culture is induced with 2 mM IPTG. Two hours after induction with IPTG, 50 ml (0.9 mmol) of
crude [3,4,5-13C3]1-deoxy-D-xylulose (pH 7.0) (see examples 9 and 10) are added. Aliquots of
25 ml are taken at time intervals of 30 minutes and centrifuged for 20 min at 5,000 rpm and
4 °C. The cells are washed with water containing 0.9 % NaCl and centrifuged as described
above. The cells are suspended in 700 µl of 20 mM NaF in D2O, cooled on ice and sonified
3 x 10 sec with a Branson Sonifier 250 (Branson SONIC Power Company) set to 90 % duty
cycle output, control value of 4. The suspension is centrifuged at 15,000 rpm for 15 min. 13C
NMR spectra of the supernatant are recorded directly, without further purification, with a
Bruker AVANCE DRX 500 spectrometer (Karlsruhe, Germany). The NMR analysis is based on
published signal assignments (Wungsintaweekul et al., 2001).

30 min after the addition of [3,4,5-13C3]1-deoxy-D-xylulose, the formation of
[3,4,5-13C3]1-deoxy-D-xylulose 5-phosphate can be observed. The maximum yield of
[3,4,5-13C3]1-deoxy-D-xylulose 5-phosphate is observed 3-5 h after addition of
[3,4,5-13C3]1-deoxy-D-xylulose to the medium. The 13C NMR signals reveal a mixture of
[3,4,5-13C3]1-deoxy-D-xylulose and [3,4,5-13C3]1-deoxy-D-xylulose 5-phosphate at a molar
ratio of approximately 1 : 9. The intracellular concentration of
[3,4,5-13C3]1-deoxy-D-xylulose 5-phosphate is estimated as 20 mM by quantitative NMR
spectroscopy.

Example 12 

Incorporation experiment with recombinant Escherichia coli XL1-pBSxylBdxr using
[U-13C5]1-deoxy-D-xylulose

0.12 litre of Luria Bertani (LB) medium containing 22 mg of ampicillin is inoculated with 10 ml
of an overnight culture of E. coli strain XL1-Blue harbouring the plasmid pBSxylBdxr (see
example 2). The cells are grown in a shaking culture at 37 °C. At an optical density (600 nm)
of 0.6 the culture is induced with 2 mM IPTG. Two hours after induction with IPTG, ca. 1.0
mmol of crude [U-13C5]1-deoxy-D-xylulose (pH 7.0) (see examples 8 and 10) is added. Aliquots
of 25 ml are taken at time intervals of 1 h and centrifuged for 20 min at 5,000 rpm and 4 °C.
The cells are washed with water containing 0.9 % NaCl and centrifuged as described above.
The cells are suspended in 700 µl of 20 mM NaF in D2O, cooled on ice and sonified 3 x 10 sec
with a Branson Sonifier 250 (Branson SONIC Power Company) set to 90 % duty cycle output,
control value of 4. The suspension is centrifuged at 15,000 rpm for 15 min. NMR spectra of the
supernatant are recorded directly, without further purification, with a Bruker AVANCE DRX 500
spectrometer (Karlsruhe, Germany).

HMQC and HMQC-TOCSY experiments reveal 1H-13C and 1H-1H spin systems (Table 1) of
[U-13C5]2C-methyl-D-erythritol 4-phosphate, [U-13C5]2C-methyl-D-erythritol and
[1,2,2',3,4-13C5]4-diphosphocytidyl-2C-methyl-D-erythritol at a molar ratio of approximately
6.6 : 7 : 1, respectively. The intracellular concentration of [U-13C5]2C-methyl-D-erythritol
4-phosphate is estimated as 10 mM by quantitative NMR spectroscopy.
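
A molar ratio such as 6.6 : 7 : 1 is easier to compare across experiments when expressed as mole fractions. The following is a minimal sketch of that normalisation using the ratio quoted above; the helper name is illustrative.

```python
def mole_fractions(ratio: dict) -> dict:
    """Normalise a molar ratio so that the parts sum to 1."""
    total = sum(ratio.values())
    if total <= 0:
        raise ValueError("ratio must have a positive sum")
    return {name: part / total for name, part in ratio.items()}


example_12 = {
    "2C-methyl-D-erythritol 4-phosphate": 6.6,
    "2C-methyl-D-erythritol": 7.0,
    "4-diphosphocytidyl-2C-methyl-D-erythritol": 1.0,
}

fractions = mole_fractions(example_12)

for name, fraction in fractions.items():
    print(f"{name}: {100 * fraction:.1f} %")
```

For this experiment the cytidyl derivative accounts for roughly 7 % of the labelled products, with the remainder split almost evenly between the phosphate and the free tetrol.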

The NMR data summarized in Table 1 are identical with published NMR data of the authentic 
compounds (Takahashi et al., 1998; Rohdich et al., 1999). 



Table 1. NMR data of 13C-labeled products in cell extracts of E. coli XL1-pBSxylBdxr after
feeding of [U-13C5]1-deoxy-D-xylulose

                 Chemical shifts, ppm
Position         1 1'           2       2-Methyl    3       4 4'

[U-13C5]2C-methyl-D-erythritol 4-phosphate
  13C            66.1           n.d.    18.1        73.4    64.8
  1H             3.25 | 3.36            0.93        3.56    3.62 | 3.81

[U-13C5]2C-methyl-D-erythritol
  13C            66.8           n.d.    18.0        74.6    61.6
  1H             3.26 | 3.34            0.9         3.44    3.36 | 3.61

[1,2,2',3,4-13C5]4-diphosphocytidyl-2C-methyl-D-erythritol
  13C            66.8           n.d.    18.0        73.0    66.7
  1H             3.4 | 3.55             0.9         3.6     3.74 | 4



Example 13 

Incorporation experiment with recombinant Escherichia coli XL1-pBSxylBdxrispDF using
[3,4,5-13C3]1-deoxy-D-xylulose

0.1 litre of Luria Bertani (LB) medium containing 18 mg of ampicillin is inoculated with 10 ml
of an overnight culture of E. coli strain XL1-Blue harbouring the plasmid pBSxylBdxrispDF (see
example 4). The cells are grown in a shaking culture at 37 °C. At an optical density (600 nm)
of 0.5, the culture is induced with 2 mM IPTG. Two hours after induction with IPTG, ca. 1.0
mmol of crude [3,4,5-13C3]1-deoxy-D-xylulose (see examples 9 and 10) is added. After three
hours, the cells are harvested and centrifuged for 20 min at 5,000 rpm and 4 °C. The cells are
washed with water containing 0.9 % NaCl and centrifuged as described above. The cells are
suspended in 1.5 ml of 20 mM NaF in D2O, cooled on ice and sonified 3 x 15 sec with a
Branson Sonifier 250 (Branson SONIC Power Company) set to 90 % duty cycle output, control
value of 4. The suspension is centrifuged at 15,000 rpm for 15 min. NMR spectra of the
supernatant are recorded directly, without further purification, with a Bruker AVANCE DRX 500
spectrometer (Karlsruhe, Germany).

HMQC and HMQC-TOCSY experiments reveal 1H-13C and 1H-1H spin systems of
[1,3,4-13C3]2C-methyl-D-erythritol 4-phosphate, [1,3,4-13C3]2C-methyl-D-erythritol and
[1,3,4-13C3]4-diphosphocytidyl-2C-methyl-D-erythritol (Table 1). The molar ratios of
[1,3,4-13C3]2C-methyl-D-erythritol 4-phosphate, [1,3,4-13C3]2C-methyl-D-erythritol and
[1,3,4-13C3]4-diphosphocytidyl-2C-methyl-D-erythritol are 1 : 0.6 : 0.9, respectively.

This result indicates that the intracellular amount of CTP required for the synthesis of 4- 
diphosphocytidyl-2C-methyl-D-erythritol is limiting. Therefore, a modified fermentation protocol 
was developed (see example 14). 

Example 14 

Incorporation experiment with recombinant Escherichia coli XL1-pBSxylBdxrispDF using
[3,4,5-13C3]1-deoxy-D-xylulose

0.1 litre of Luria Bertani (LB) medium containing 18 mg of ampicillin is inoculated with 10 ml
of an overnight culture of E. coli strain XL1-Blue harbouring the plasmid pBSxylBdxrispDF (see
example 4). The cells are grown in a shaking culture at 37 °C. At an optical density (600 nm)
of 0.5, the culture is induced with 2 mM IPTG. Two hours after induction with IPTG, 10 mg
(0.041 mmol) of cytidine, 5 ml of 1 M NaKHPO4, pH 7.2, and ca. 1 mmol of crude
[3,4,5-13C3]1-deoxy-D-xylulose (see examples 9 and 10) are added. After three hours, the cells
are harvested and centrifuged for 20 min at 5,000 rpm and 4 °C. The cells are washed with water
containing 0.9 % NaCl and centrifuged as described above. The cells are suspended in 700 µl
of 20 mM NaF in D2O, cooled on ice and sonified 3 x 10 sec with a Branson Sonifier 250
(Branson SONIC Power Company) set to 90 % duty cycle output, control value of 4. The
suspension is centrifuged at 15,000 rpm for 15 min. NMR spectra of the supernatant are
recorded directly, without further purification, with a Bruker AVANCE DRX 500 spectrometer.
HMQC and HMQC-TOCSY experiments reveal 1H-13C and 1H-1H spin systems (Table 1) of
[1,3,4-13C3]2C-methyl-D-erythritol 4-phosphate, [1,3,4-13C3]2C-methyl-D-erythritol and
[1,3,4-13C3]4-diphosphocytidyl-2C-methyl-D-erythritol at a molar ratio of approximately
1 : 3.4 : 4.2, respectively. The relative amount of
[1,3,4-13C3]4-diphosphocytidyl-2C-methyl-D-erythritol is increased by a factor of 2 as
compared to the relative amount in example 13. The intracellular concentration of
[1,3,4-13C3]4-diphosphocytidyl-2C-methyl-D-erythritol is estimated as 10 mM by quantitative
NMR spectroscopy. The relatively high amount of 2C-methyl-D-erythritol indicates that
unspecific phosphatases convert the intermediately formed 2C-methyl-D-erythritol 4-phosphate
into 2C-methyl-D-erythritol. Therefore, a modified fermentation protocol was developed to
supply the cells with sufficient amounts of organic phosphates and to suppress the activity of
phosphatases (see examples 15 to 17).

Example 15

Incorporation experiment with recombinant Escherichia coli XL1-pBScyclo using
[U-13C5]1-deoxy-D-xylulose

0.2 litre of Luria Bertani (LB) medium containing 36 mg of ampicillin is inoculated with 10 ml
of an overnight culture of E. coli strain XL1-Blue harbouring the plasmid pBScyclo (see
example 5). The cells are grown in a shaking culture at 37 °C. At an optical density (600 nm)
of 1.3, the culture is induced with 2 mM IPTG. Two hours after induction with IPTG, 30 mg
(0.12 mmol) of cytidine, 300 mg (0.95 mmol) of DL-α-glycerol 3-phosphate and 10 ml of 1 M
NaKHPO4, pH 7.2, are added. After 30 min, ca. 1 mmol of [U-13C5]1-deoxy-D-xylulose (see
examples 8 and 10) is added. Aliquots of 25 ml are taken at time intervals of 1 h and
centrifuged for 20 min at 5,000 rpm and 4 °C. The cells are washed with water containing 0.9
% NaCl and centrifuged as described above. The cells are suspended in 700 µl of 20 mM NaF
in D2O, cooled on ice and sonified 3 x 10 sec with a Branson Sonifier 250 (Branson SONIC
Power Company) set to 90 % duty cycle output, control value of 4. The suspension is
centrifuged at 15,000 rpm for 15 min. NMR spectra of the cell-free extract are recorded
directly, without further purification, with a Bruker AVANCE DRX 500 spectrometer (Karlsruhe,
Germany). 13C NMR spectra, as well as HMQC and HMQC-TOCSY spectra, established
[U-13C5]2C-methyl-D-erythritol 2,4-cyclodiphosphate (Herz et al., 2000) as the only product.
Formation of [U-13C5]2C-methyl-D-erythritol 2,4-cyclodiphosphate can be observed 30 min after
addition of [U-13C5]1-deoxy-D-xylulose, whereas the maximum yield is observed 5 h after
addition of [U-13C5]1-deoxy-D-xylulose. The intracellular concentration of
[U-13C5]2C-methyl-D-erythritol 2,4-cyclodiphosphate is estimated as 20 mM by quantitative NMR
spectroscopy. The formation of any other isotope-labelled products, such as
[U-13C5]2C-methyl-erythritol, is completely suppressed.

Example 16 

Incorporation experiment with recombinant Escherichia coli XL1-pBScyclo-pACYCgcpE using
[2-14C]- and [U-13C5]1-deoxy-D-xylulose

0.2 litre of Terrific Broth (TB) medium containing 36 mg of ampicillin and 2.5 mg of
chloramphenicol is inoculated with the E. coli strain XL1-Blue harbouring the plasmids
pBScyclo and pACYCgcpE (see examples 5 and 6). The cells are grown in a shaking culture at
37 °C overnight. At an optical density (600 nm) of 4.8 to 5.0, 30 mg (0.1 mmol) of cytidine, 300
mg (0.94 mmol) of DL-α-glycerol 3-phosphate and 10 ml of 1 M NaKHPO4, pH 7.2, are added.
After 30 minutes, a mixture of 2.6 µmol of [2-14C]1-deoxy-D-xylulose (15 µCi µmol-1)
(Wungsintaweekul et al., 2001) and 1 ml of crude [U-13C5]1-deoxy-D-xylulose (0.02 mmol) (see
examples 8 and 10) is added. After 1.5 h, the cells are harvested and centrifuged for 10 min at
5,000 rpm and 4 °C. The cells are washed with water containing 0.9 % NaCl and centrifuged
as described above. The cells are suspended in a mixture of 20 mM NaF (2 ml) and methanol
(2 ml), cooled on ice and sonified 3 x 15 sec with a Branson Sonifier 250 (Branson SONIC
Power Company) set to 90 % duty cycle output, control value of 4. The suspension is
centrifuged at 15,000 rpm for 15 min. The radioactivity of the supernatant is measured by
scintillation counting (Beckmann, LS 7800). 10 % of the radioactivity initially added as
14C-labelled 1-deoxy-D-xylulose is detected in the supernatant. Aliquots are analysed by TLC
and HPLC, as described in example 19, and the products are purified as described in example 20.
On the basis of these data, 1-hydroxy-2-methyl-2-butenyl 4-diphosphate and 2C-methyl-D-
erythritol 2,4-cyclodiphosphate were identified as products at a molar ratio of 7 : 3 (see also
examples 17 and 18).
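
The radiochemical figures in this example can be followed with a short calculation. The sketch below assumes that the specific activity of the tracer reads as 15 µCi per µmol and that the 0.02 mmol of 13C-labelled 1-deoxy-D-xylulose carries no radioactivity; both are readings of the text rather than stated facts, and the variable names are illustrative.

```python
tracer_umol = 2.6                    # [2-14C]1-deoxy-D-xylulose
specific_activity_uci_per_umol = 15  # assumed reading of the quoted specific activity
carrier_umol = 20.0                  # 0.02 mmol of [U-13C5]1-deoxy-D-xylulose, assumed non-radioactive
recovered_fraction = 0.10            # 10 % found in the supernatant

added_uci = tracer_umol * specific_activity_uci_per_umol
recovered_uci = added_uci * recovered_fraction
diluted_activity = added_uci / (tracer_umol + carrier_umol)

# Split of the labelled products at the quoted molar ratio of 7 : 3
hmbpp_fraction = 7 / (7 + 3)
cyclodiphosphate_fraction = 3 / (7 + 3)

print(f"activity added:       {added_uci:.1f} µCi")
print(f"activity recovered:   {recovered_uci:.1f} µCi")
print(f"diluted spec. act.:   {diluted_activity:.2f} µCi/µmol")
print(f"product split:        {hmbpp_fraction:.0%} / {cyclodiphosphate_fraction:.0%}")
```

On these assumptions 39 µCi is fed, about 3.9 µCi is recovered in the extract, and mixing with the 13C-labelled carrier lowers the specific activity to roughly 1.7 µCi per µmol.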

Example 17 

Incorporation experiment with recombinant Escherichia coli XL1-pBScyclo-pACYCgcpE using
[U-13C5]- or [3,4,5-13C3]1-deoxy-D-xylulose

0.2 litre of Terrific Broth (TB) medium containing 36 mg of ampicillin and 2.5 mg of
chloramphenicol is inoculated with the E. coli strain XL1-Blue harbouring the plasmids
pBScyclo and pACYCgcpE. The cells are grown in a shaking culture at 37 °C overnight. At
an optical density (600 nm) of 4.8 - 5.0, 30 mg (0.1 mmol) of cytidine, 300 mg (0.93 mmol) of
DL-α-glycerol 3-phosphate and 10 ml of 1 M NaKHPO4, pH 7.2, are added. After 30 minutes,
3 ml of crude [3,4,5-13C3]- or [U-13C5]1-deoxy-D-xylulose (0.05 mmol) (see examples 8, 9 and
10) are added. Aliquots of 25 ml are taken at time intervals of 1 h and centrifuged for 20 min
at 5,000 rpm and 4 °C. The cells are washed with water containing 0.9 % NaCl and centrifuged
as described above. The cells are suspended in 700 µl of 20 mM NaF in D2O or in 700 µl of a
mixture of methanol and D2O (6:4; v/v) containing 10 mM NaF, cooled on ice and sonified 3 x
10 sec with a Branson Sonifier 250 (Branson SONIC Power Company) set to 90 % duty cycle
output, control value of 4. The suspension is centrifuged at 15,000 rpm for 15 min. NMR
spectra of the cell-free extracts are recorded directly with a Bruker AVANCE DRX 500
spectrometer (Karlsruhe, Germany). In order to avoid degradation during work-up, the
structures of the products are determined by NMR spectroscopy without further purification
(see example 18).

Example 18 

Structure determination of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate 

The 1H-decoupled 13C NMR spectrum using [U-13C5]1-deoxy-D-xylulose as starting material
displays 5 13C-13C coupled signals belonging to 2C-methyl-D-erythritol 2,4-cyclodiphosphate
(Herz et al., 2000) and 5 13C-13C coupled signals at 14.7, 64.5, 68.6, 122.7 and 139.5 ppm
(Table 2) belonging to an unknown metabolite. The chemical shifts of the unknown metabolite
suggest a double bond motif (signals at 122.7 and 139.5 ppm), a methyl group (signal at 14.7
ppm), and two carbon atoms (signals at 64.5 and 68.6 ppm) connected to OR (R = unknown).
The three signals accounting for carbon atoms with sp3 hybridisation (14.7, 64.5 and 68.6 ppm)
show 13C-13C coupling to one adjacent 13C atom with coupling constants of 40 - 50 Hz (Table
2). The signal at 122.7 ppm shows 13C couplings to two adjacent 13C neighbours (coupling
constants, 74 and 50 Hz), whereas the signal at 139.5 ppm shows 13C couplings to three
neighbouring 13C atoms (coupling constants, 74, 43 and 43 Hz). In conjunction with the
chemical shift topology, this coupling signature is indicative of a 2-methyl-2-butenyl skeleton.
HMQC and HMQC-TOCSY experiments reveal the 1H NMR chemical shifts (Table 2), as well
as 13C-1H and 1H-1H spin systems (Table 3). More specifically, the 13C NMR signal at 122.7
ppm correlates to a 1H NMR signal at 5.6 ppm, which is in the typical chemical shift range for
H-atoms attached to CC double bonds, whereas the signal at 139.5 ppm gives no 13C-1H
correlations. The signals at 64.5 and 68.6 ppm give 13C-1H correlations to 1H signals at 4.5 and
3.9 ppm, respectively. The methyl signal at 14.7 ppm correlates to a proton signal at 1.5 ppm.
In connection with the 13C-13C coupling patterns (Table 2), as well as with 1H-13C long range
correlations (HMBC experiment, Table 3), these data establish a 1,4-dihydroxy-2-methyl-2-
butenyl system.
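
The coupling argument above amounts to counting, for each carbon of a uniformly 13C-labelled 2-methyl-2-butenyl skeleton, how many directly bonded carbon neighbours it has, and comparing that with the number of large one-bond 13C-13C couplings seen for each signal. The sketch below encodes the skeleton as a list of bonds; the atom labels and the assignment of shifts to positions follow Table 2, and the data structures are illustrative.

```python
from collections import Counter

# Carbon-carbon bonds of the 2-methyl-2-butenyl skeleton (C2=C3 is the double bond).
bonds = [("C1", "C2"), ("C2", "CH3"), ("C2", "C3"), ("C3", "C4")]

neighbour_count = Counter()
for a, b in bonds:
    neighbour_count[a] += 1
    neighbour_count[b] += 1

# Number of one-bond 13C-13C couplings reported in the text for each signal (ppm).
observed = {
    "C1": (68.6, 1),
    "C2": (139.5, 3),
    "CH3": (14.7, 1),
    "C3": (122.7, 2),
    "C4": (64.5, 1),
}

consistent = all(neighbour_count[atom] == n for atom, (_, n) in observed.items())

for atom, (shift, n) in observed.items():
    print(f"{atom} ({shift} ppm): {n} observed, {neighbour_count[atom]} expected")
print("skeleton consistent with coupling pattern:", consistent)
```

Only a branched five-carbon chain with one carbon bearing three carbon neighbours reproduces the 1, 3, 1, 2, 1 pattern, which is why the couplings point to this skeleton.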

Starting from [3,4,5-13C3]1-deoxy-D-xylulose as feeding material, three signals at 64.5, 68.6 and
122.7 ppm accounting for atoms 4, 1 and 3, respectively, of the new product are observed. It
can be concluded that the carbon atoms 1, 3 and 4 of the new product are biogenetically
equivalent to the carbon atoms 3, 4 and 5 of [3,4,5-13C3]1-deoxy-D-xylulose 5-phosphate.
This coupling topology is similar to the coupling pattern of 2C-methyl-D-erythritol 4-phosphate
(see example 13), confirming that the new compound is derived via 2C-methyl-D-erythritol 4-
phosphate.

The C-4 and C-3 13C NMR signals at 64.5 and 122.7 ppm show 13C-31P coupling of 5.5 and 8.0
Hz, respectively. These couplings indicate the presence of a phosphate or pyrophosphate
group at position 4 of the 2-methyl-2-butenyl skeleton.

In line with this observation, the 1H-decoupled 31P NMR spectrum of the product displays a
doublet at -9.2 ppm (31P-31P coupling constant, 20.9 Hz) and a double-double-doublet at -10.6
ppm (31P-13C coupling constants, 5.8 and 7.4 Hz; 31P-31P coupling constant, 20.9 Hz). Without
1H-decoupling, the 31P NMR signal at -10.6 ppm is broadened whereas the signal at -9.2 ppm is
not affected by 1H coupling. The chemical shifts as well as the observed coupling pattern
confirm the presence of a free diphosphate moiety at position 4.
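
The multiplet shapes described above follow from first-order splitting: each coupling to a spin-1/2 nucleus doubles the number of lines. The sketch below lists the line positions, in Hz relative to the centre of each signal, for the quoted coupling constants. It assumes an ideal first-order spectrum and does not model line widths or overlap; the function name is illustrative.

```python
from itertools import product


def first_order_lines(couplings_hz: tuple) -> list:
    """Line offsets (Hz) of a first-order multiplet from couplings to spin-1/2 nuclei."""
    offsets = [
        sum(sign * j / 2 for sign, j in zip(signs, couplings_hz))
        for signs in product((-1, 1), repeat=len(couplings_hz))
    ]
    return sorted(round(offset, 2) for offset in offsets)


# Signal at -9.2 ppm: one 31P-31P coupling of 20.9 Hz
doublet = first_order_lines((20.9,))

# Signal at -10.6 ppm: 31P-31P coupling of 20.9 Hz plus 31P-13C couplings of 7.4 and 5.8 Hz
ddd = first_order_lines((20.9, 7.4, 5.8))

print("doublet:", doublet)
print("double-double-doublet:", ddd)
```

The phosphorus that couples to two 13C atoms gives eight lines, while the other shows only the two lines of the 31P-31P doublet, consistent with a diphosphate whose inner phosphorus is attached to the labelled carbon chain.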

In summary, all these data establish the structure as 1-hydroxy-2-methyl-2-butenyl 4-
diphosphate.

Table 2. NMR data of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate

             Chemical shifts, ppm             Coupling constants, Hz
Position     1H (a)    13C (b)    31P (c)     JPC         JPP      JCC

1            3.91      68.6                                        5.5, 3.5
2                      139.5                                       73.9, 43.3, 43.3
2-Methyl     1.51      14.7                                        42.2, 4.0, 4.0
3            5.57      122.7                                       73.9, 49.8, 4.0
4            4.46      64.5                   5.5                  49.3, 5.5
Pβ                                -9.2                    20.9
Pα                                -10.6       5.8, 7.4    20.9

(a) referenced to external trimethylsilylpropane sulfonate.
(b) referenced to external trimethylsilylpropane sulfonate.
(c) referenced to external 85 % orthophosphoric acid.
(d) observed with [1,3,4-13C3]1-hydroxy-2-methyl-2-butenyl 4-diphosphate
(e) observed with [U-13C5]1-hydroxy-2-methyl-2-butenyl 4-diphosphate



Table 3. Correlation pattern of [1,3,4-¹³C3]1-hydroxy-2-methyl-2-butenyl 4-diphosphate and of
[U-¹³C5]1-hydroxy-2-methyl-2-butenyl 4-diphosphate

            NMR correlation pattern
Position    HMQC          HMQC-TOCSY    HMBC

1           1 (a, b)      1 (a, b)      2-methyl, 2
2
2-methyl    2-methyl      2-methyl
3           3 (a, b)      3             2-methyl, 1
4           4 (a, b)      4, 3

(a) observed with [1,3,4-¹³C3]1-hydroxy-2-methyl-2-butenyl 4-diphosphate
(b) observed with [U-¹³C5]1-hydroxy-2-methyl-2-butenyl 4-diphosphate

Example 19

Detection of phosphorylated metabolites of the mevalonate-independent pathway 
Method A) By a TLC method 

Aliquots (10 µl) of the cell-free extracts from recombinant cells prepared as described above
(see example 16) are spotted on a Polygram® SIL NH-R thin layer plate (Macherey-Nagel,
Düren, Germany). The TLC plate is then developed in a solvent system of n-propanol : ethyl
acetate : water, 6:1:3 (v/v/v). The running time is about 4 h. The radio chromatogram is
monitored and evaluated by a Phosphor Imager (Storm 860, Molecular Dynamics, USA). The
Rf values of the compounds under study are shown in Table 4.

Table 4: Rf values of precursors and intermediates of the mevalonate-independent terpenoid
pathway



Chemical compound                                          Rf value

1-deoxy-D-xylulose                                         0.80
1-deoxy-D-xylulose 5-phosphate                             0.5
2C-methyl-D-erythritol 4-phosphate                         0.42
4-diphosphocytidyl-2C-methyl-D-erythritol                  0.33
4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate      0.27
2C-methyl-D-erythritol 2,4-cyclodiphosphate                0.47
1-hydroxy-2-methyl-2-butenyl 4-diphosphate                 0.17
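
The Rf values of Table 4 can be used to propose an assignment for a spot on a developed plate. The following Python sketch is only an illustration: the function names and the matching tolerance of 0.03 are assumptions made here and are not taken from the example.

```python
# Rf values copied from Table 4 (n-propanol : ethyl acetate : water, 6:1:3).
RF_VALUES = {
    "1-deoxy-D-xylulose": 0.80,
    "1-deoxy-D-xylulose 5-phosphate": 0.50,
    "2C-methyl-D-erythritol 4-phosphate": 0.42,
    "4-diphosphocytidyl-2C-methyl-D-erythritol": 0.33,
    "4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate": 0.27,
    "2C-methyl-D-erythritol 2,4-cyclodiphosphate": 0.47,
    "1-hydroxy-2-methyl-2-butenyl 4-diphosphate": 0.17,
}


def rf_value(spot_distance_mm, front_distance_mm):
    """Rf = distance travelled by the spot / distance travelled by the solvent front."""
    if front_distance_mm <= 0:
        raise ValueError("solvent front distance must be positive")
    return spot_distance_mm / front_distance_mm


def candidates(rf, tolerance=0.03):
    """Return the Table 4 compounds within +/- tolerance of rf, closest first.

    The tolerance is a hypothetical default, not a value given in the text.
    """
    hits = [
        (abs(reference - rf), name)
        for name, reference in RF_VALUES.items()
        if abs(reference - rf) <= tolerance + 1e-9  # small slack for float rounding
    ]
    return [name for _, name in sorted(hits)]


if __name__ == "__main__":
    # A spot that migrated 17 mm with the solvent front at 100 mm.
    print(candidates(rf_value(17, 100)))
```

Compounds with similar Rf values, such as 2C-methyl-D-erythritol 4-phosphate (0.42) and 2C-methyl-D-erythritol 2,4-cyclodiphosphate (0.47), can both be returned for one spot, so such a lookup only narrows the assignment.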



Method B) By a HPLC method 

Aliquots (100 µl) of the cell-free extracts from recombinant cells prepared as described above
(see example 16) are analyzed by HPLC using a column of Multospher 120 RP 18-AQ-5 (4.6
x 250 mm, particle size 5 µm, CS-Chromatographie Service GmbH, Langerwehe, Germany)
that has been equilibrated for 15 min with 10 mM tetrabutylammonium hydrogensulfate
(TBAS), pH 6.0, at a flow rate of 0.75 ml min⁻¹. After injection of the sample, the column is
developed for 20 min with 10 mM TBAS, then for 60 min with a linear gradient of 0 - 42 % (v/v)
methanol in 10 mM TBAS. The effluent is monitored by a continuous-flow radio detector
(Beta-RAM, Biostep GmbH, Jahnsdorf, Germany). The retention volumes of the compounds under
study are shown in Table 5.

Table 5: Retention volumes of precursors and intermediates of the mevalonate-independent
terpenoid pathway



Chemical compound                                          Retention volume [ml]

1-deoxy-D-xylulose                                         6.0
1-deoxy-D-xylulose 5-phosphate                             15
2C-methyl-D-erythritol 4-phosphate                         13.5
4-diphosphocytidyl-2C-methyl-D-erythritol                  30.8
4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate      41.3
2C-methyl-D-erythritol 2,4-cyclodiphosphate                31.5
1-hydroxy-2-methyl-2-butenyl 4-diphosphate                 42.8
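
Retention volumes can be converted into retention times, and into the methanol content of the pump program at that time, from the flow rate and gradient stated in Method B. The Python sketch below is a simplified illustration: it ignores the dead volume of the column and the delay volume of the pump, and the function names are chosen here for illustration.

```python
FLOW_ML_PER_MIN = 0.75          # flow rate stated in Method B
ISOCRATIC_MIN = 20.0            # 10 mM TBAS without methanol
GRADIENT_MIN = 60.0             # linear gradient of 0 - 42 % (v/v) methanol
FINAL_METHANOL_PERCENT = 42.0


def retention_time_min(retention_volume_ml):
    """Retention time in minutes for a given retention volume."""
    return retention_volume_ml / FLOW_ML_PER_MIN


def programmed_methanol_percent(time_min):
    """Methanol content (% v/v) of the pump program at a time after injection."""
    if time_min < 0:
        raise ValueError("time must not be negative")
    if time_min <= ISOCRATIC_MIN:
        return 0.0
    elapsed = min(time_min - ISOCRATIC_MIN, GRADIENT_MIN)
    return FINAL_METHANOL_PERCENT * elapsed / GRADIENT_MIN


if __name__ == "__main__":
    # Retention volume of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate (Table 5).
    t = retention_time_min(42.8)
    print(f"{t:.1f} min, {programmed_methanol_percent(t):.1f} % methanol")
```

Under these assumptions, 1-deoxy-D-xylulose (6.0 ml) elutes during the isocratic phase, whereas the phosphorylated late-eluting compounds elute within the methanol gradient.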



Example 20 

Purification of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate 

The crude cell-free extract obtained from the feeding experiment with recombinant Escherichia
coli XL1-pBScyclo-pACYCgcpE using [2-¹⁴C]- and [U-¹³C5]1-deoxy-D-xylulose (see example
16) is lyophilized. The residue is dissolved in 600 µl of water and centrifuged for 10 min at
14,000 rpm. Aliquots of 90 µl are applied on a column of Nucleosil 10 SB (4.6 x 250 mm,
Macherey & Nagel, Düren, Germany) which is developed with a linear gradient of 0.1 - 0.25 M
ammonium formate in 70 ml at a flow rate of 2 ml min⁻¹. The retention volumes for
2C-methyl-D-erythritol 2,4-cyclodiphosphate and 1-hydroxy-2-methyl-2-butenyl 4-diphosphate are
25 and 44 ml, respectively. Fractions are collected and lyophilized. NMR data of
1-hydroxy-2-methyl-2-butenyl 4-diphosphate are identical with the data shown in example 18, Table 2.

Example 21 

Construction of a vector carrying the xylB, dxr, ispD, ispE, ispF and ispG genes of Escherichia
coli capable of transcription and expression of D-xylulokinase, DXP reductoisomerase, CDP-ME
synthase, CDP-ME kinase, cMEPP synthase and 1-hydroxy-2-methyl-2-butenyl 4-diphosphate
synthase

The E. coli ORF ispG (accession no. gb AE000338) from base pair (bp) position 372 to 1204
is amplified by PCR using chromosomal E. coli DNA as template. The reaction mixture
contains 10 pmol of the primer
5'-GCGGGAGACCGCGGGAGGAGAAATTAACCATGCATAACCAGGCTCCAATTCG-3', 10 pmol
of the primer 5'-CGCTTCCCAGCGGCCGCTTATTTTTCAACCTGCTGAACG-3', 20 ng of
chromosomal DNA, 2 U of Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs in a total
volume of 100 µl containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and
0.1 % (w/w) Triton X-100.
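
The amounts given for the reaction mixture can be converted into final concentrations. The Python sketch below is a simple illustration of that arithmetic; the helper name is chosen here, and whether the 20 nmol of dNTPs refer to the sum of all four nucleotides or to each of them is not stated in the text, so the value is reported as given.

```python
REACTION_VOLUME_UL = 100.0


def micromolar(amount_pmol, volume_ul):
    """pmol per µl is numerically equal to µmol per litre (µM)."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount_pmol / volume_ul


primer_um = micromolar(10.0, REACTION_VOLUME_UL)       # 10 pmol of each primer
dntp_um = micromolar(20_000.0, REACTION_VOLUME_UL)     # 20 nmol = 20,000 pmol
template_ng_per_ul = 20.0 / REACTION_VOLUME_UL         # 20 ng chromosomal DNA

if __name__ == "__main__":
    print(f"each primer: {primer_um:.2f} µM")
    print(f"dNTPs (as stated): {dntp_um:.0f} µM")
    print(f"template: {template_ng_per_ul:.2f} ng/µl")
```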

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles of 60 sec at 94 °C, 60 sec
at 50 °C and 90 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.
The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden).

1.4 µg of the vector pBScyclo (Example 5) and 0.8 µg of the purified PCR product are digested
with SacII and NotI in order to produce DNA fragments with overlapping ends. The restriction
mixtures are prepared according to the conditions given by the supplier (NEB) and are
incubated for 3 h at 37 °C. Digested vector DNA and PCR product are purified using the PCR
purification kit from Qiagen.
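
That the two primers of this example introduce the recognition sequences used for the digestion can be checked with a short script. The Python sketch below is an illustration only: the recognition sequences of SacII (CCGCGG) and NotI (GCGGCCGC) are general knowledge rather than part of the example, the run of characters in the second primer that is damaged in the source is read here as TTTTT (an assumption, consistent with the gene-specific part of the reverse ispG primer of Example 27), and positions are 0-based.

```python
# Recognition sequences (both palindromic, so one strand is sufficient).
SITES = {"SacII": "CCGCGG", "NotI": "GCGGCCGC"}

# Primers of Example 21; the TTTTT run in REVERSE is an assumed reading.
FORWARD = "GCGGGAGACCGCGGGAGGAGAAATTAACCATGCATAACCAGGCTCCAATTCG"
REVERSE = "CGCTTCCCAGCGGCCGCTTATTTTTCAACCTGCTGAACG"


def find_sites(sequence, sites=SITES):
    """Map enzyme name to the 0-based position of its first site in the sequence."""
    sequence = sequence.upper()
    return {
        name: sequence.find(site)
        for name, site in sites.items()
        if site in sequence
    }


if __name__ == "__main__":
    print("forward primer:", find_sites(FORWARD))
    print("reverse primer:", find_sites(REVERSE))
```

The forward primer carries only the SacII site and the reverse primer only the NotI site, which is what allows the amplificate to be inserted in a defined orientation.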

20 ng of the purified vector DNA and 18 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pBScyclogcpE. The ligation mixture is incubated for 2 h at 25 °C. 1 µl of
the ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The plasmid
pBScyclogcpE is isolated with the plasmid isolation kit from Qiagen.

The DNA insert of the plasmid pBScyclogcpE is sequenced by the automated
dideoxynucleotide method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with
the ABI Prism™ Sequencing Analysis Software from Applied Biosystems Divisions. It is
identical with the DNA sequence of the database entry (gb AE000338). The DNA sequence of
the vector construct pBScyclogcpE is shown in Annex H.

Example 22 

Construction of a vector carrying the xylB, dxr, ispD, ispE, ispF, ispG and lytB genes of
Escherichia coli capable of transcription and expression of D-xylulokinase, DXP
reductoisomerase, CDP-ME synthase, CDP-ME kinase, cMEPP synthase,
1-hydroxy-2-methyl-2-butenyl 4-diphosphate synthase and LytB

The E. coli ORF lytB (accession no. gb AE005179) from base pair (bp) position 7504 to 8454
is amplified by PCR using chromosomal E. coli DNA as template. The reaction mixture
contains 10 pmol of the primer
5'-AAATCGGAGCTCGAGGAGAAATTAACCATGCAGATCCTGTTGGCC-3', 10 pmol of the
primer 5'-GCTGCTCCGCGGTTAATCGACTTCACGAATATCG-3', 20 ng of chromosomal DNA,
2 U of Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs in a total volume of 100 µl
containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and 0.1 % (w/w)
Triton X-100.

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles of 45 sec at 94 °C, 45 sec
at 50 °C and 60 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.
The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden).

1.3 µg of the vector pBScyclogcpE (Example 21) and 0.7 µg of the purified PCR product are
digested with SacI and SacII in order to produce DNA fragments with overlapping ends. The
restriction mixtures are prepared according to the conditions given by the supplier (NEB)
and are incubated for 3 h at 37 °C. Digested vector DNA and PCR product are purified using
the PCR purification kit from Qiagen.

20 ng of the purified vector DNA and 16 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pBScyclogcpElytB. The ligation mixture is incubated for 2 h at 25 °C. 1 µl
of the ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The plasmid
pBScyclogcpElytB is isolated with the plasmid isolation kit from Qiagen.



WO 02/083720

PCT/EP02/04005

The DNA insert of the plasmid pBScyclogcpElytB is sequenced by the automated
dideoxynucleotide method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with
the ABI Prism™ Sequencing Analysis Software from Applied Biosystems Divisions. It is
identical with the DNA sequence of the database entry (gb AE005179).

Example 23 

Incorporation experiment with recombinant Escherichia coli XL1-pBScyclogcpE using
[U-¹³C5]1-deoxy-D-xylulose

0.1 litre of Terrific Broth (TB) medium containing 18 mg of ampicillin is inoculated with E. coli
strain XL1-Blue harbouring the plasmid pBScyclogcpE. The cells are grown in a shaking culture
at 37 °C overnight. At an optical density (600 nm) of 4.8 - 5.0, 30 mg (0.1 mmol) of cytidine are
added. A solution containing 1.2 g of lithium lactate (12.5 mmol) and 6 ml of crude
[U-¹³C5]1-deoxy-D-xylulose (0.05 mmol) (see examples 8, 9 and 10) in 0.1 M Tris hydrochloride
(pH 7.5) at a final volume of 30 ml is added continuously within 2 hours. Aliquots of 25 ml are
taken at time intervals of 1 h and centrifuged for 20 min at 5,000 rpm and 4 °C. The cells are
washed with water containing 0.9 % NaCl and centrifuged as described above. The cells are
suspended in 700 µl of 20 mM NaF in D2O or in 700 µl of a mixture of methanol and D2O (6:4,
v/v) containing 10 mM NaF, cooled on ice and sonified 3 x 10 sec with a Branson Sonifier 250
(Branson SONIC Power Company) set to 90 % duty cycle output, control value of 4 output.
The suspension is centrifuged at 15,000 rpm for 15 min. NMR spectra of the cell-free extracts
are recorded directly with a Bruker AVANCE DRX 500 spectrometer (Karlsruhe, Germany). In
order to avoid degradation during work-up, the structures of the products are determined by
NMR spectroscopy without further purification.

The relative amount of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate to 2C-methyl-D-erythritol
2,4-cyclodiphosphate could be raised by a factor of approximately 2-3 by the addition of lithium
lactate to the medium.

Example 24 

Preparation of (E)-1-hydroxy-2-methyl-2-butenyl diphosphate triammonium salt (8)

General. Chemicals are obtained from Acros Organics (Fisher Scientific GmbH, Schwerte,
Germany), SIGMA-ALDRICH (Deisenhofen, Germany), MERCK (Darmstadt, Germany) and
used without further purification. Solvents are used distilled and/or dried. Chromatography is
performed on silica gel 60 (230-400 mesh, Fluka Riedel-de Haën, Taufkirchen, Germany),
DOWEX 50 WX8 (200-400 mesh, SERVA, Heidelberg, Germany), and Cellulose (Avicel,
Cellulose mikrokristallin, Merck, Darmstadt, Germany). TLC is performed on silica gel 60 F254
plastic sheets (MERCK) or cellulose F plastic sheets (MERCK), detection by anisaldehyde
solution (anisaldehyde:H2SO4:HAc 0.5:1:50 v/v/v). NMR spectra are recorded on BRUKER
AMX 400, DRX 500, and AC 250 spectrometers at room temperature.

Acetonyl tetrahydropyranyl ether (12) (Hagiwara et al., 1984).

A mixture of 339 mg (1.35 mmol) of pyridinium-toluene-4-sulfonate, 9.35 ml (10.0 g, 0.135 mol)
of hydroxyacetone, and 24.7 ml (22.7 g, 0.270 mol) of 3,4-dihydro-2H-pyran is stirred at room
temperature for 2.5 h. Residual 3,4-dihydro-2H-pyran is removed under reduced pressure. The
crude mixture is purified by FC on silica gel (hexanes/acetone 4:1, 6.5 x 20 cm) to yield 18.7 g
(0.118 mol, 88%) of a colorless liquid.

¹H NMR (CDCl3, 500 MHz) δ 4.62 (t, J = 3.6 Hz, 1H), 4.22 (d, J = 17.3 Hz, 1H), 4.09 (d, J =
17.3 Hz, 1H), 3.83-3.79 (m, 1H), 3.51-3.47 (m, 1H), 2.15 (s, 3H), 1.87-1.49 (m, 6H); ¹³C NMR
(CDCl3, 126 MHz) δ 206.7, 98.7, 72.3, 62.3, 30.2, 26.5, 25.2, 19.2; MS (CI, isobutane) m/z 159
[M + 1]+.

(E,Z)-Ethyl-2-methyl-1-tetrahydropyranyloxy-but-2-enoate (13) (Watanabe et al., 1996).
33.0 g (94.8 mmol) of (ethoxycarbonylmethylen)-triphenylphosphorane are dissolved in 500 ml
of dry toluene under nitrogen atmosphere at room temperature. Then, 10.0 g (63.2 mmol) of
acetonyl tetrahydropyranyl ether 12 are added and the mixture is heated to reflux. After 39 h
at this temperature the solvent is evaporated under reduced pressure to yield an orange oil.
Major amounts of triphenylphosphinoxide are precipitated by the addition of 100 ml
hexanes/acetone 9:1. After filtration the filtrate is concentrated and another 100 ml of
hexanes/acetone 9:1 are added. The solid is filtered off and the solvent removed to yield 18 g
of an orange oil that is purified by FC on silica gel (hexanes/acetone 9:1, 6.5 x 28 cm) to yield
12.9 g (56.5 mmol, 89%) of a mixture of (E)-13/(Z)-13 = 5:1.

(E)-(13). ¹H NMR (CDCl3, 500 MHz) δ 5.96 (q, J = 1.4 Hz, 1H), 4.62 (t, J = 3.5 Hz, 1H), 4.20
(dd, J = 15.5 Hz, 1.3 Hz, 1H), 4.14 (q, J = 7.1 Hz, 2H), 3.93 (dd, J = 15.6, 1.3 Hz, 1H),
3.84-3.79 (m, 1H), 3.52-3.48 (m, 1H), 2.08 (d, J = 1.4 Hz, 3H), 1.88-1.50 (m, 6H), 1.26 (t, J = 7.1
Hz, 3H); ¹³C NMR (CDCl3, 126 MHz) δ 166.8, 154.7, 114.5, 98.0, 70.6, 62.0, 59.7, 30.3, 25.3,
19.1, 15.9, 14.3; MS (CI, isobutane) m/z 229 [M + 1]+.

(Z)-(13). ¹H NMR (CDCl3, 500 MHz) δ 5.71 (q, J = 1.4 Hz, 1H), 4.60 (t, J = 3.6 Hz, 1H), 4.20
(dd, J = 15.5 Hz, 1.3 Hz, 1H), 4.11 (q, J = 7.1 Hz, 2H), 3.93 (dd, J = 15.6, 1.3 Hz, 1H),
3.84-3.79 (m, 1H), 3.52-3.48 (m, 1H), 1.97 (d, J = 1.4 Hz, 3H), 1.88-1.50 (m, 6H), 1.24 (t, J = 7.1
Hz, 3H); ¹³C NMR (CDCl3, 126 MHz) δ 165.9, 156.8, 116.9, 98.7, 66.5, 62.3, 59.8, 30.6, 25.3,
21.9, 19.5, 14.3; MS (CI, isobutane) m/z 229 [M + 1]+.

(E,Z)-2-Methyl 1-tetrahydropyranyloxy-but-2-ene-4-ol (14) (Watanabe et al., 1996).
A solution of ester 13 (8.73 g, 38.2 mmol) in 100 ml of dry CH2Cl2 is cooled to -78 °C. Then,
91.8 ml (91.8 mmol) of 1.0 M DIBAH in hexanes are added slowly under an atmosphere of
nitrogen. The resulting solution is stirred for 3 h at -78 °C before the reaction is quenched by
the addition of 1.5 ml of 1 M NaOH. After warming to room temperature the solvent is removed
under reduced pressure. The resulting gummy residue is largely dissolved by adding twice 100
ml of MeOH. The resulting mixture is passed through a column of SiO2, freed from the
solvent by evaporation and then loaded on a column of SiO2/Na2SO4 that is purged with 1400 ml
of MeOH. Evaporation of the solvent gives 9.5 g of a colorless liquid that is purified by FC on
silica gel (hexanes/acetone 1:3, 6.5 x 16 cm) to yield 6.98 g (37.4 mmol, 98%) of a colorless
liquid, (E)-14/(Z)-14 = 6:1.

(E)-(14). ¹H NMR (CDCl3, 500 MHz) δ 5.68 (tq, J = 6.6, 1.3 Hz, 1H), 4.60 (t, J = 3.6 Hz, 1H),
4.20 (d, J = 6.8 Hz, 2H), 4.12 (d, J = 12.0 Hz, 1H), 3.87-3.82 (m, 1H), 3.85 (d, J = 12.5 Hz,
1H), 3.52-3.48 (m, 1H), 1.86-1.48 (m, 6H), 1.69 (s, 3H); ¹³C NMR (CDCl3, 126 MHz) δ 135.7,
125.4, 97.8, 71.9, 62.1, 59.1, 30.5, 25.4, 19.4, 14.1; MS (CI, isobutane) m/z 169 [M - H2O + 1]+.

(Z)-(14). ¹H NMR (CDCl3, 500 MHz) δ 5.64 (tq, J = 6.6, 1.3 Hz, 1H), 4.63 (t, J = 3.3 Hz, 1H),
4.20 (d, J = 6.8 Hz, 2H), 4.15 (d, J = 11.8 Hz, 1H), 3.87-3.82 (m, 1H), 3.83 (d, J = 11.3 Hz,
1H), 3.52-3.48 (m, 1H), 1.86-1.48 (m, 6H), 1.79 (s, 3H); ¹³C NMR (CDCl3, 126 MHz) δ 136.2,
128.6, 96.6, 65.1, 61.8, 58.1, 30.3, 25.3, 21.9, 19.0; MS (CI, isobutane) m/z 169 [M - H2O + 1]+.

(E,Z)-4-Chloro-2-methyl 1-tetrahydropyranyloxy-but-2-en (15) (Hwang et al., 1984).

To a solution of alcohol 14 (1.00 g, 5.37 mmol) in 10 ml of dry CH2Cl2 are added 918 mg (7.52
mmol) of DMAP in 10 ml of dry CH2Cl2 and 1.23 g (6.44 mmol) of p-TsCl in 10 ml of dry
CH2Cl2. The resulting solution is stirred at room temperature for 1 h. After evaporation of the
solvent under reduced pressure the residue is purified by FC on silica gel (CH2Cl2, 5 x 20 cm)
to obtain 693 mg (3.39 mmol, 63%) of a colorless liquid, (E)-15/(Z)-15 = 6:1.

(E)-(15). ¹H NMR (CDCl3, 500 MHz) δ 5.77 (tq, J = 8.0, 1.5 Hz, 1H), 4.64 (t, J = 3.6 Hz, 1H),
4.18 (d, J = 12.8 Hz, 1H), 4.15 (d, J = 8.0 Hz, 2H), 3.92 (d, J = 12.8 Hz, 1H), 3.90-3.86 (m,
1H), 3.59-3.52 (m, 1H), 1.92-1.52 (m, 6H), 1.77 (s, 3H); ¹³C NMR (CDCl3, 126 MHz) δ 138.6,
121.7, 97.8, 71.3, 62.1, 40.2, 30.5, 25.4, 19.3, 13.9; MS (CI, isobutane) m/z 205 [M + 1]+.

(Z)-(15). ¹H NMR (CDCl3, 500 MHz) δ 5.65 (t, J = 8.1 Hz, 1H), 4.61 (t, J = 3.6 Hz, 1H), 4.18 (d,
J = 12.8 Hz, 1H), 4.15 (d, J = 8.0 Hz, 2H), 3.92 (d, J = 12.8 Hz, 1H), 3.90-3.86 (m, 1H),
3.59-3.52 (m, 1H), 1.92-1.52 (m, 6H), 1.86 (s, 3H); ¹³C NMR (CDCl3, 126 MHz) δ 138.3, 124.6, 97.5,
64.7, 62.2, 40.1, 30.5, 25.4, 21.8, 19.4; MS (CI, isobutane) m/z 205 [M + 1]+.

(E,Z)-2-Methyl 1-tetrahydropyranyloxy-but-2-enyl diphosphate triammonium salt (16)
(Davisson et al., 1986).

To a solution of chloride 15 (260 mg, 1.27 mmol) in 1.3 ml of MeCN a solution of 1.38 g (1.52
mmol) tris(tetra-n-butylammonium) hydrogen pyrophosphate in 3.0 ml of MeCN is added
slowly at room temperature, obtaining an orange-red solution. The reaction is followed by
¹H NMR, taking advantage of the upfield shift of the multiplet of H-3. After 2 h the reaction is
finished and the solvent is removed under reduced pressure. The orange oil is dissolved in 2.5
ml of H2O and passed through a column of DOWEX 50 WX8 (2.5 x 3 cm) cation-exchange
resin (NH4+ form) that has been equilibrated with two column volumes (40 ml) of 25 mM
NH4HCO3. The column is eluted with 60 ml of 25 mM NH4HCO3. The resulting solution is
lyophilized, dissolved in 5 ml of isopropanol/100 mM NH4HCO3 1:1 and loaded on a cellulose
column (2 x 18 cm) that is eluted by isopropanol/100 mM NH4HCO3 1:1. The effluent is
lyophilized, obtaining 495 mg (1.25 mmol, 98%) of (E)-16/(Z)-16 = 6:1 as a white solid.

(E)-(16). ¹H NMR (D2O, 500 MHz) δ 5.52 (tq, J = 6.8 Hz, 1H), 4.65 (s, 1H), 4.34 (t, J = 7.0 Hz,
2H), 3.98 (d, J = 12.3 Hz, 1H), 3.84 (d, J = 12.1 Hz, 1H), 3.74-3.70 (m, 1H), 3.42-3.38 (m, 1H),
1.61-1.57 (m, 2H), 1.54 (s, 3H), 1.40-1.32 (m, 4H); ¹³C NMR (D2O, 126 MHz) δ 136.4, 123.9
(dd, J = 8.0, 2.3 Hz), 98.5, 72.5, 63.2, 62.2 (d, J = 5.3 Hz), 29.9, 24.5, 19.0, 13.4; ³¹P NMR
(D2O, 101 MHz) δ -5.62 (d, J = 20.9 Hz), -7.57 (d, J = 20.8 Hz).

(Z)-(16). ¹H NMR (D2O, 500 MHz) δ 5.52 (t, J = 6.8 Hz, 1H), 4.65 (s, 1H), 4.31 (t, J = 7.1 Hz, 2H),
3.98 (d, J = 12.3 Hz, 1H), 3.84 (d, J = 12.1 Hz, 1H), 3.74-3.70 (m, 1H), 3.42-3.38 (m, 1H), 1.64
(s, 3H), 1.61-1.57 (m, 2H), 1.40-1.32 (m, 4H); ¹³C NMR (D2O, 126 MHz) δ 136.3, 125.8 (d, J =
8.6 Hz), 98.6, 72.5, 63.2, 61.8 (d, J = 5.1 Hz), 29.9, 24.5, 20.8, 19.0; ³¹P NMR (D2O, 101 MHz)
δ -5.69 (d, J = 20.8 Hz), -7.68 (d, J = 20.8 Hz).

(E,Z)-1-Hydroxy-2-methyl-but-2-enyl diphosphate triammonium salt (8) (Davisson et al.,
1986).

268 mg (0.675 mmol) of protected pyrophosphate 16 are dissolved in 2.0 ml of D2O and the pH
is adjusted to 1 by addition of 40 µl of 37% DCl in D2O. After 1 min at this pH the solution is
neutralized by addition of 40 µl of 40% NaOD in D2O and a ¹H NMR spectrum is measured,
which demonstrates 50% deprotection. The procedure is repeated until deprotection is
complete, with only small amounts of decomposition product formed, giving a total of 7 min at
pH 1. Purification is performed by loading the neutral solution, diluted by addition of 2 ml of
isopropanol/100 mM NH4HCO3 1:1, on a cellulose column (isopropanol/100 mM NH4HCO3 1:1,
2 x 10.5 cm) to yield 193 mg (0.616 mmol, 91%) of a white solid of (E)-8/(Z)-8 = 7:1.

(E)-(8). ¹H NMR (D2O, 500 MHz) δ 5.51 (tq, J = 6.8, 1.2 Hz, 1H), 4.41 (t, J = 7.2 Hz, 2H), 3.90
(s, 2H), 1.59 (s, 3H); ¹³C NMR (D2O, 126 MHz) δ 139.8, 120.6 (d, J = 7.7 Hz), 66.5, 62.4 (d, J
= 5.3 Hz), 13.2; ³¹P NMR (D2O, 101 MHz) δ -4.48 (d, J = 20.8 Hz), -7.06 (d, J = 20.8 Hz).

(Z)-(8). ¹H NMR (D2O, 500 MHz) δ 5.49 (tm, J = 6.8 Hz, 1H), 4.41 (t, J = 7.3 Hz, 2H), 4.03
(s, 2H), 1.70 (s, 3H); ¹³C NMR (D2O, 126 MHz) δ 139.8, 123.5 (d, J = 7.7 Hz), 61.7 (d, J =
5.1 Hz), 59.9, 20.6; ³¹P NMR (D2O, 101 MHz) δ -4.48 (d, J = 20.8 Hz), -7.06 (d, J = 20.8
Hz).

Reagents and conditions (steps (a) to (f) in Fig. 4: 1): (a) DHP, PPTS, 25 °C (2.5 h); (b)
Ph3PCHCO2Et, toluene, reflux (39 h); (c) (1) DIBAH, CH2Cl2, -78 °C (3 h), (2) 1 M NaOH/H2O;
(d) p-TsCl, DMAP, CH2Cl2, 25 °C (1 h); (e) ((CH3CH2CH2CH2)4N)3HP2O7, MeCN, 25 °C (2 h);
(f) HCl/H2O, pH 1, 25 °C (7 min).
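
The yields reported for steps (a) to (f) can be combined into an approximate overall yield from hydroxyacetone to the diphosphate 8. The Python sketch below is an illustration only: it multiplies the stated fractional yields, treats the E/Z mixtures as one product, and uses function and variable names chosen here.

```python
from math import prod

# Step, product number and isolated yield as stated in Example 24.
STATED_YIELDS = [
    ("a", "12", 0.88),
    ("b", "13", 0.89),
    ("c", "14", 0.98),
    ("d", "15", 0.63),
    ("e", "16", 0.98),
    ("f", "8", 0.91),
]


def percent_yield(product_mmol, limiting_reagent_mmol):
    """Isolated yield in percent relative to the limiting reagent."""
    if limiting_reagent_mmol <= 0:
        raise ValueError("amount of limiting reagent must be positive")
    return 100.0 * product_mmol / limiting_reagent_mmol


overall_yield = prod(fraction for _, _, fraction in STATED_YIELDS)

if __name__ == "__main__":
    # Step (b): 56.5 mmol of ester 13 from 63.2 mmol of ether 12.
    print(f"step (b): {percent_yield(56.5, 63.2):.0f} %")
    print(f"overall, steps (a) to (f): {100 * overall_yield:.0f} %")
```

The chlorination step (d), at 63%, is the main loss in the sequence; the other five steps each proceed in 88% yield or better.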

Example 25 

Identification of (E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate 

The structure of the GcpE product is further analyzed by comparison with the chemical shifts
of a synthetic sample of 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate.

For this purpose, [2-¹⁴C]1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate (0.36 µCi) is added
to a cell extract obtained from bioengineered Escherichia coli cells endowed with artificial gene
constructs expressing the xylB, ispC, ispD, ispE, ispF and gcpE genes which are supplied with
[U-¹³C5]1-deoxy-D-xylulose (see example 16). The supernatant of the cell extract is purified by
HPLC (Nucleosil 5 SB, 7.5 x 250 mm, developed with a gradient of 100 mM to 250 mM
NH4HCOO, flow rate 2 ml/min, 35 min). The product is eluted at 23 min and collected. After
lyophilization the residue is dissolved in D2O (pH 6) and subjected to NMR analysis (Figure
3-A).

Then, 40 µl of a solution of synthetically prepared (E,Z)-1-hydroxy-2-methyl-2-butenyl
4-diphosphate (E/Z = 7:1) (D2O, pH 7) are added to the NMR sample, which is again analyzed by
NMR spectroscopy (Figure 3-B). On the one hand, as shown in Figure 3-B, signals accounting
for (E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate are selectively increased, providing
evidence that the biologically produced structure is identical with the synthetically produced
one, i.e. the (E)-isomer. On the other hand, the signals of the minor (Z)-isomer rise without any
correspondence to signals of the biologically produced product. Figure 3-C shows the same
effects after addition of another 40 µl of the solution of synthetically prepared
(E,Z)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate.

Example 26 

Incorporation of (E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate into the lipid soluble fraction
of Capsicum annuum chromoplasts.

Chromoplasts are isolated by a slight modification of a method described by Camara (Camara,
1985; Camara, 1993). Pericarp of red pepper (650 g) is homogenized at 4 °C in 600 ml of 50
mM Hepes, pH 8.0, containing 1 mM DTE, 1 mM EDTA and 0.4 M sucrose (buffer A). The
suspension is filtered through four layers of nylon cloth (50 µm) and centrifuged (10 min, 4,500
rpm, GSA rotor) to obtain a pellet of crude chromoplasts which is homogenized in 200 ml of
buffer A. The suspension is centrifuged (10 min, 4,500 rpm, GSA rotor). The pellet is
homogenized and resuspended in 3 ml of 50 mM Hepes, pH 7.6, containing 1 mM DTE. The
suspension is filtered through one layer of nylon cloth (50 µm).

Reaction mixtures contain 100 mM Hepes, pH 7.6, 2 mM MnCl2, 10 mM MgCl2, 5 mM NaF, 2
mM NADP+, 1 mM NADPH, 6 mM ATP, 20 µM FAD and 2 mg of chromoplasts. 8.8 nmol of
[2-¹⁴C]2C-methyl-D-erythritol 2,4-cyclodiphosphate, [2-¹⁴C]1-hydroxy-2-methyl-2-(E)-butenyl
diphosphate or [2-¹⁴C]isopentenyl diphosphate (specific activities 15.8 µCi/µmol) are
added and the mixtures are incubated at 30 °C overnight. The reaction is terminated by
methylene chloride extraction. The organic phase is concentrated under a stream of nitrogen.
Aliquots are spotted on silica gel plates (Polygram SIL-G, UV254, Macherey-Nagel, Düren,
Germany). The plates are developed with hexane : ether = 6:1 (system I) and/or hexane :
toluene = 9:1 (system II), respectively. The chromatograms are monitored with a phosphor
imager (Storm 860, Molecular Dynamics, Sunnyvale, CA, USA). The Rf values of
geranylgeraniol and the carotene fraction in system I are 0.35 and 0.9, respectively. The Rf
values of β-carotene, phytoene and phytofluene in system II are 0.65, 0.60 and 0.55,
respectively.
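
The radioactivity applied per reaction mixture follows from the amount of labelled substrate and its specific activity as given above. The Python sketch below illustrates this arithmetic; the conversion factor of 2.22 x 10^6 disintegrations per minute per µCi is the general definition of the curie and not a value from the example, and the function name is chosen here.

```python
DPM_PER_MICROCURIE = 2.22e6  # 1 µCi = 3.7e4 Bq = 2.22e6 disintegrations per minute


def activity_microcurie(amount_nmol, specific_activity_uci_per_umol):
    """Radioactivity in µCi for a given amount of labelled compound."""
    return amount_nmol / 1000.0 * specific_activity_uci_per_umol


# 8.8 nmol of substrate at 15.8 µCi/µmol, as used per reaction mixture.
applied_uci = activity_microcurie(8.8, 15.8)
applied_dpm = applied_uci * DPM_PER_MICROCURIE

if __name__ == "__main__":
    print(f"{applied_uci:.3f} µCi, about {applied_dpm:.0f} dpm per reaction mixture")
```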

The evaluation of the chromatograms shows that radioactivity can be efficiently diverted from
1-hydroxy-2-methyl-2-(E)-butenyl diphosphate into the geranylgeraniol, β-carotene, phytoene
and phytofluene fractions of C. annuum chromoplasts, establishing
1-hydroxy-2-methyl-2-(E)-butenyl diphosphate as a real intermediate of the non-mevalonate
pathway downstream from 2C-methyl-D-erythritol 2,4-cyclodiphosphate and upstream from
isopentenyl diphosphate.

Example 27 

Construction of a vector carrying the ispG (gcpE) and ispH (lytB) genes of Escherichia coli
capable of transcription and expression thereof

The E. coli ORF ispH (lytB) (accession no. gb AE000113) from base pair (bp) position 5618 to
6568 is amplified by PCR using chromosomal E. coli DNA as template. The reaction mixture
contains 10 pmol of the primer
5'-GCTTGCGTCGACGAGGAGAAATTAACCATGCAGATCCTGTTGGCCACC-3', 10 pmol of the
primer 5'-GCTGCTCGGCCGTTAATCGACTTCACGAATATCG-3', 20 ng of chromosomal DNA,
2 U of Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs in a total volume of 100 µl
containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and 0.1 % (w/w)
Triton X-100.

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles of 45 sec at 94 °C, 45 sec
at 50 °C and 60 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.

The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden).

2.4 µg of the vector pACYC184 (Chang and Cohen, 1978; NEB) and 0.7 µg of the purified PCR
product are digested with SalI and EagI in order to produce DNA fragments with overlapping
ends. The restriction mixtures are prepared according to the conditions given by the
supplier (NEB) and are incubated for 3 h at 37 °C. Digested vector DNA and PCR product are
purified using the PCR purification kit from Qiagen.

20 ng of the purified vector DNA and 18 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pACYClytB. The ligation mixture is incubated for 2 h at 25 °C. 1 µl of the
ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The plasmid
pACYClytB is isolated with the plasmid isolation kit from Qiagen.

The DNA insert of the plasmid pACYClytB is sequenced by the automated dideoxynucleotide
method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with the ABI Prism™
Sequencing Analysis Software from Applied Biosystems Divisions. It is identical with the DNA
sequence of the database entry (gb AE000113).

The E. coli ORF ispG (gcpE) (accession no. gb AE000338) from base pair (bp) position 372 to
1204 is amplified by PCR using chromosomal E. coli DNA as template. The reaction mixture
contains 10 pmol of the primer
5'-CGTACCGGATCCGAGGAGAAATTAACCATGCATAACCAGGCTCCAATTC-3', 10 pmol of the
primer 5'-CCCATCGTCGACTTATTTTTCAACCTGCTGAACGTC-3', 20 ng of chromosomal
DNA, 2 U of Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs in a total volume of
100 µl containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and 0.1 %
(w/w) Triton X-100.

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles of 60 sec at 94 °C, 60 sec
at 50 °C and 90 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.
The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden).

2.0 µg of the vector pACYClytB and 0.9 µg of the purified PCR product are digested with
BamHI and SalI in order to produce DNA fragments with overlapping ends. The restriction
mixtures are prepared according to the conditions given by the supplier (NEB) and are
incubated for 3 h at 37 °C. Digested vector DNA and PCR product are purified using the PCR
purification kit from Qiagen.

20 ng of the purified vector DNA and 23 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pACYClytBgcpE. The ligation mixture is incubated for 2 h at 25 °C. 1 µl of
the ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The plasmid
pACYClytBgcpE is isolated with the plasmid isolation kit from Qiagen.

The DNA insert of the plasmid pACYClytBgcpE is sequenced by the automated
dideoxynucleotide method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with
the ABI Prism™ Sequencing Analysis Software from Applied Biosystems Divisions. It is
identical with the DNA sequence of the database entry (gb AE000338).
The DNA sequence of the vector construct pACYClytBgcpE is shown in Annex I.

The DNA and corresponding amino acid sequence of ispH (lytB) from Escherichia coli is
shown in Annex J.

Examole 28 

Gonstruction of a vector carrying the xy/B, dxr, ispD, /spE, /spF, ispG and ispH genes of 
Escherichia coli capable for transcription and expression of D-xylulokinase, DXP 
reductolsomerase, GDP-ME synthase. GDP-ME kinase cMEPP synthase, 1-hydroxy-2-methyl- 
2-butenyl 4-diphosphate synthase and IPP/DMAPP synthase 

The E. CO// ORFs ispG (fomierly gcpE) and ispH (formerly lytB) are amplified by PGR using the 
plasmid pAGYGIytBgcpE (see example 27) as template. The reaction mixture contains 10 pmol 
of the primer 5'- 

GGGGGAGAGGGGGGGAGGAGAAATTAAGGATGGATAAGG AGGGTGGAATTGAAGG\ 1 0 
pmol of the primer 5-AGGGTGGGGGGGGGTTAATGGAGTTGAGGAATATGG-3', 2 ng of 
pAGYGgcpElytB DNA. 2 U of Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs In a 
total volume of 100 pl containing 1.5 mM MgGlj. 50 mM KGI. 10 mM Tris-hydrochloride. pH 8.8 
and 0.1 % (w/w) Triton X-100. 
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The mixture is denaturated for 3 min at 94 "C. Then 30 PGR cycles for 60 sec at 94 ^'C, 60 sec 
at 50 ""C and 150 sec at 72 X followed. After further incubation for 20 min at 72 X. the 
mixture is cooled to 4 ''C. An aliquot of 2 \}\ is subjected to agarose gel electrophoresis. 

The PGR amplificate is purified with the PGR purification kit from Qiagen (Hilden). 

t 

1 .7 |jg of the vector pBScycIo (Example 5) and 1 .3 pg of the purified PGR product are digested 
with Sacll and Not\ in order to produce DNA fragments with overlapping ends. The restriction 
mixtures are prepared according to the conditions supplied by the customer (NEB) and are 
incubated for 3 h at 37 **G. Digested vector DNA and PGR product are purified using the PGR 
purification kit from Qiagen. 

22 ng of the purified vector DNA and 19 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pBScyclogcpElytB2. The ligation mixture is incubated for 2 h at 25 °C. 1
µl of the ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The
plasmid pBScyclogcpElytB2 is isolated with the plasmid isolation kit from Qiagen.

The DNA insert of the plasmid pBScyclogcpElytB2 is sequenced by the automated
dideoxynucleotide method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with
the ABI Prism™ Sequencing Analysis Software from Applied Biosystems Divisions. The DNA
sequence of the vector construct pBScyclogcpElytB2 is shown in Annex K.

Example 29 

Incorporation experiment with recombinant Escherichia coli XL1-pBScyclogcpElytB2 using
[U-13C5]1-deoxy-D-xylulose

0.1 litre of Terrific Broth (TB) medium containing 18 mg of ampicillin is inoculated with E. coli
strain XL1-Blue harbouring the plasmid pBScyclogcpElytB2. The cells are grown in a shaking
culture at 37 °C overnight. At an optical density (600 nm) of 1.3-1.7 a solution containing
2.4 g of lithium lactate (25 mmol) and 10 ml of crude [U-13C5]1-deoxy-D-xylulose (0.05 mmol) (see
example 8) at a final volume of 30 ml (pH=7.4) is added continuously within 2 hours. Aliquots
of 40 ml are taken at time intervals of 30 minutes and centrifuged for 20 min at 5,000 rpm and
4 °C. The cells are washed with water containing 0.9 % NaCl and centrifuged as described
above. The cells are suspended in 700 µl of a mixture of methanol and D2O (6:4, v/v)
containing 10 mM NaF, cooled on ice and sonified 3x10 sec with a Branson Sonifier 250
(Branson SONIC Power Company) set to 90 % duty cycle output, control value of 4 output.
The suspension is centrifuged at 15,000 rpm for 15 min. NMR spectra of the cell free extracts
are recorded directly with a Bruker AVANCE DRX 500 spectrometer (Karlsruhe, Germany). In
order to avoid degradation during work-up, the structures of the products are determined by
NMR spectroscopy without further purification.

Example 30 

Structure determination of isopentenyl diphosphate (IPP) and dimethylallyl diphosphate
(DMAPP)

The 1H-decoupled 13C NMR spectrum using [U-13C5]1-deoxy-D-xylulose as starting material
(see examples 8 and 30) displays five intense 13C-13C coupled signals belonging to
2C-methyl-D-erythritol 2,4-cyclodiphosphate (Herz et al., 2000) and five 13C-13C coupled signals
with low intensities belonging to 1-hydroxy-2-methyl-2-butenyl 4-diphosphate (see example 18) (100:3
ratio for the 2-methyl 13C NMR signal intensities of 2C-methyl-D-erythritol 2,4-cyclodiphosphate
and 1-hydroxy-2-methyl-2-butenyl 4-diphosphate, respectively).

In addition, a set of five 13C-13C coupled signals at 21.6 (doublet), 37.8 (triplet), 64.1 (doublet),
111.6 (doublet), and 143.3 ppm (doublet of triplets) (unknown metabolite A) accompanied by
signals at 21.1 (doublet), 39.6 (triplet), 59.3 (doublet), 111.8 (doublet), and 143.2 ppm (doublet
of triplets) (unknown metabolite B) is detected. The ratio of the 2-methyl signal of
2C-methyl-D-erythritol 2,4-cyclodiphosphate and the putative methyl signals of the unknown
compounds at 21.6 ppm (metabolite A) and 21.1 ppm (metabolite B) is 100:24:4, respectively.

Moreover, 13C coupled signals with low intensities belonging to another unknown compound
(metabolite C) at 17.1 (doublet), 24.9 (doublet), 62.7 (doublet), 119.6 (double-doublet) and
139.4 ppm (multiplet) are detected. The ratio of the intensities of the putative methyl signals at
21.6 (metabolite A), 17.1 and 24.9 (metabolite C) is 100:13:13, respectively.

The 31P NMR spectrum of the reaction mixture is characterized by intense signals for
2C-methyl-D-erythritol 2,4-cyclodiphosphate (Herz et al., 2000). Furthermore, 31P31P coupled
broadened signals are observed at a chemical shift range typical for organic diphosphates (-6
to -13 ppm, 31P31P coupling constants, 20 Hz).

Metabolite A: 

The signals of metabolite A at 111.6 and 143.3 ppm are indicative of a double bond motif, and
the signals at 64.1, 37.8 and 21.6 ppm reflect three aliphatic carbon atoms, one of which
(signal at 64.1 ppm) appears to be connected to OH or OR (R=unknown).
Additional information about the structure of the unknown metabolite A can be gleaned from
the 13C coupling pattern. Three of the 13C NMR signals (21.6, 64.1 and 111.6 ppm) are split
into doublets indicating three 13C atoms each connected to only one 13C-labelled neighbour,
one signal (37.8 ppm) displays a pseudo-triplet signature indicating a 13C atom with two
adjacent 13C atoms, and one signal (143.3 ppm) is split into a doublet of triplets indicating a 13C
atom with three 13C connections. In conjunction with the chemical shifts, this connectivity
pattern establishes metabolite A as an isopentenyl derivative.
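
The connectivity argument can be checked by counting, for each carbon of a universally 13C-labelled isopentenyl skeleton, the number of directly bonded 13C neighbours. The sketch below is only an illustration of this counting step; the bond list follows the position numbering of Table 6 (C-3 as branching point, C-4 as exo-methylene carbon, C-5 as methyl group), and the names used are hypothetical.

```python
from collections import Counter

# Carbon-carbon bonds of the isopentenyl skeleton (numbering as in Table 6)
IPP_BONDS = [(1, 2), (2, 3), (3, 4), (3, 5)]


def labelled_neighbours(bonds):
    """Number of directly bonded carbons for every carbon atom."""
    counts = Counter()
    for a, b in bonds:
        counts[a] += 1
        counts[b] += 1
    return dict(sorted(counts.items()))


neighbours = labelled_neighbours(IPP_BONDS)
for carbon, n in neighbours.items():
    # one neighbour -> doublet, two -> triplet-like, three -> doublet of triplets
    print(f"C-{carbon}: {n} 13C neighbour(s)")
```

Three carbons with one neighbour, one with two and one with three correspond to the three doublets, the pseudo-triplet and the doublet of triplets described above.
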

The complex signature for the signal at 143.3 ppm deserves a more detailed analysis. The
large coupling (71 Hz) is typical for 13C13C couplings between carbon atoms involved in
carbon-carbon double bonds. A 71 Hz coupling is also found for the doublet signal at 111.6 ppm
representing the second carbon of the double bond. Due to the coupling pattern and the
chemical shifts the presence of an exo-methylene function is obvious. The two additional 13C
couplings found in the triplet substructure of the signal at 143.3 ppm are both 41 Hz, and
establish the respective carbon as the branching point of the structure.
HMQC experiments reveal the 1H NMR chemical shifts, as well as 13C-1H and 1H-1H spin
systems. More specifically, the 13C NMR signal at 111.6 ppm correlates to a 1H NMR signal at
4.73 ppm, whereas the signal at 143.3 ppm gives no 13C-1H correlation. The signals at 64.1,
37.8, and 21.6 ppm give 13C-1H correlations to 1H signals at 4.00, 2.31, and 1.68 ppm,
respectively. As shown by HMQC-TOCSY experiments, the proton signals at 2.31 and 4.00
are coupled, whereas the signals at 4.73 and 1.68 ppm are found as singlets in the
HMQC-TOCSY experiment. The observed 1H NMR chemical shifts in combination with the coupling
patterns demonstrate that metabolite A is an isopentenyl derivative with a single bonded
heteroatom (most plausibly O) at position 1.

The 13C and 1H chemical shifts of an authentic sample of isopentenyl diphosphate (IPP,
measured in the same solvent mixture) are identical to the chemical shifts assigned to
metabolite A. Thus, metabolite A is identified as [U-13C5]IPP.

Metabolite B:

As noted above, the coupling and correlation pattern of metabolite B observed in the 13C NMR
signals, as well as in the HMQC and HMQC-TOCSY spectra, is virtually the same as for
metabolite A (IPP), suggesting that the carbon connectivities of metabolite B and IPP are
identical. As the most significant difference between the NMR data of metabolite B and IPP, the
13C NMR chemical shift of one doublet signal for metabolite B (59.3 ppm) corresponding to the
C-1 signal of IPP (64.1 ppm) is upfield shifted by 4.9 ppm. This suggests that a phosphate
moiety is missing at C-1 in metabolite B. Therefore, metabolite B is assigned as
[U-13C5]isopentene-1-ol. Presumably, isopentene-1-ol is formed from IPP by the catalytic action
of pyrophosphatases and phosphatases present in the experimental system.

Metabolite C: 

As described above for metabolite A (IPP), the structure of metabolite C is assigned by NMR
analysis. The 13C coupling pattern of the signals attributed to metabolite C (three doublets, one
double-doublet, one multiplet) suggests that the compound is another isopentane derivative.
The chemical shifts observed for the double-doublet (119.6 ppm) and the multiplet (139.4 ppm)
show that a carbon-carbon double bond connects C-2 (coupled to two 13C neighbours) and C-3
(coupled to three 13C neighbours) of the molecule.

The 1H NMR chemical shifts of metabolite C are revealed by HMQC and HMQC-TOCSY
experiments showing two singlets at 1.75 and 1.71 ppm, and a spin system comprising signals
at 5.43 and 4.45 ppm. In conjunction with the chemical shifts, this correlation pattern shows
that metabolite C is a dimethylallyl derivative.

The 13C and 1H NMR chemical shifts of an authentic sample of dimethylallyl diphosphate
(DMAPP) are identical to the chemical shifts of the signals attributed to metabolite C. This
leaves no doubt that metabolite C is [U-13C5]dimethylallyl diphosphate (DMAPP).

The NMR data of metabolite A (IPP) and metabolite C (DMAPP) are summarized in Tables 6 
and 7. 

Table 6. NMR data of isopentenyl diphosphate (IPP)

Position   Chemical shifts, ppm         Coupling constants, Hz
           1H(a)    13C(b)   31P(c)     J(PC)   J(HH)   J(PP)   J(PH)   J(CC)(d)
1          4.00      64.1               4.9     6.6             6.6     34
2          2.31      37.8               8.0     6.7                     40, 40
3                   143.3                                               71, 41, 41
4          4.73     111.6                                               71
5          1.68      21.6                                               41
P                             -7.8                      nd
P                            -11.9                      19.5

(a) referenced to external trimethylsilylpropane sulfonate,
(b) referenced to external trimethylsilylpropane sulfonate,
(c) referenced to external 85 % orthophosphoric acid,
(d) observed with [U-13C5]IPP



Table 7. NMR data of dimethylallyl diphosphate (DMAPP)

Position   Chemical shifts, ppm         Coupling constants, Hz
           1H(a)    13C(b)   31P(c)     J(PC)   J(HH)   J(PP)   J(PH)   J(CC)(d)
1          4.45      62.7               3.6     6.6             6.6     47
2          5.43     119.6               9.0     7.2                     75, 48
3                   139.4                                               nd
4          1.75      24.9                                               42
5          1.71      17.1                                               41
P                             -9.1                      21.7
P                             -6.4                      21.5

(a) referenced to external trimethylsilylpropane sulfonate,
(b) referenced to external trimethylsilylpropane sulfonate,
(c) referenced to external 85 % orthophosphoric acid,
(d) observed with [U-13C5]DMAPP



Example 31 

Cloning of the ispG gene (fragment) from Arabidopsis thaliana 

RNA is isolated from 1 g of 2-week-old Arabidopsis thaliana var. Columbia plants (stems and
leaves) by published procedures (Logemann et al. 1987).

A mixture containing 2.75 µg RNA, 50 nmol dNTPs, 1 µg random hexameric primer, 1 µg Tig-primer
and 20 % first strand 5x buffer (Promega) in a total volume of 50 µl is incubated for 5
min at 95 °C, cooled on ice and 500 U M-MLV reverse transcriptase (Promega) are added.
The mixture is incubated for 1 h at 42 °C. After incubation at 92 °C for 5 min, RNase A (20 U)
and RNase H (2 U) are added and the mixture is incubated for 30 min at 37 °C.

The resulting cDNA (1 µl of this mixture) is used for the amplification of ispG by PCR.

The A. thaliana ORF ispG (accession no. dbj AB005246) without the coding region for the
putative leader sequence from basepair (bp) position 2889 to 6476 is amplified by PCR using
cDNA from A. thaliana as template. The reaction mixture contains 25 pmol of primer
CCTGCATCCGAAGGAAGCCC, 25 pmol of primer CAGTTTTCAAAGAATGGCCC, 1 µl of
cDNA, 2 U of Taq DNA polymerase (Eurogentec, Seraing, Belgium) and 20 nmol of dNTPs in
a total volume of 100 µl in 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and
0.1 % (w/w) Triton X-100.

The mixture is denatured for 3 min at 95 °C. Then 40 PCR cycles for 60 sec at 94 °C, 60 sec
at 50 °C and 90 sec at 72 °C follow. After further incubation for 20 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis. The PCR
amplificate is purified with the PCR purification kit from Qiagen. 1.7 µg of purified PCR product
are obtained.

The PCR amplificate is used as template for a second PCR reaction. The reaction mixture
contains 25 pmol of primer TGAATCAGGATCCAAGACGGTGAGAAGG, 25 pmol of primer
TCCGTTTGGTACCCTACTCATCAGCCACGG, 2 µl of the first PCR amplification, 2 U of Taq
DNA polymerase (Eurogentec, Seraing, Belgium) and 20 nmol of dNTPs in a total volume of
100 µl containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and 0.1 %
(w/w) Triton X-100.

The mixture is denatured for 3 min at 95 °C. Then 40 PCR cycles for 60 sec at 94 °C, 60 sec
at 50 °C and 90 sec at 72 °C follow. After further incubation for 20 min at 72 °C, the mixture is
cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.

The PCR amplificate is purified with the PCR purification kit from Qiagen. 1.4 µg of purified PCR
product are obtained. 2.0 µg of the vector pQE30 and 1.4 µg of the purified PCR product are
digested with BamHI and KpnI in order to produce cohesive ends. The restriction mixtures are
prepared according to the conditions supplied by the manufacturer (NEB) and are incubated for 3
h at 37 °C. Digested vector DNA and PCR product are purified using the PCR purification kit
from Qiagen.

20 ng of vector DNA and 12 ng of PCR product are ligated together with 1 U of T4-Ligase
(Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl, yielding the plasmid
pQEgcpEara. The ligation mixture is incubated overnight at 4 °C. 2 µl of the ligation mixture
is transformed into electrocompetent E. coli XL1-Blue and M15[pREP4] (Zamenhof et al.,
1972) cells. The plasmid pQEgcpEara is isolated as described above. 7 µg of plasmid DNA are
obtained.

The DNA insert of the plasmid pQEgcpEara is sequenced as described above. The DNA
sequence is found not to be identical with the sequence in the database (accession no. dbj
AB005246, see Annex L).

Example 32 

Screening of IspG (GcpE) enzyme activity 

0.2 g cells of XL1-pACYClytBgcpE are suspended in 1 ml 50 mM Tris hydrochloride, pH 7.4
and 2 mM DTT, cooled on ice and sonified 3x7 sec with a Branson Sonifier 250 (Branson
SONIC Power Company) set to 80 % duty cycle output, control value of 4 output. The
suspension is centrifuged at 14,000 rpm for 15 minutes. The supernatant is used as crude cell
extract in assays described as follows.

The assay mixture contains 100 mM Tris hydrochloride, pH 7.4, 1.2 mM dithiothreitol, 10 mM
NaF, 1 mM CoCl2, 2 mM NADH, 20 mM (18 µCi mol-1) [2-14C]2C-methyl-erythritol
2,4-cyclodiphosphate, 0.5 mM pamidronate and 100 µl crude cell extract of XL1-pACYClytBgcpE in
a total volume of 150 µl. The mixture is incubated for 10 to 45 min at 37 °C and cooled on ice.
10 µl of 30 % (w/v) trichloroacetic acid are added and the mixture is neutralized with 20 µl of 1
M NaOH. The mixture is centrifuged at 14,000 rpm for 10 minutes. Aliquots of 130 µl of the
supernatant are analyzed by reversed-phase ion-pair HPLC using a column of Multospher 120
RP 18-AQ-5 (4.6 x 250 mm, CS-Chromatographie Service GmbH, Langerwehe, Germany).
The column is developed with a linear gradient of 7-21 % (v/v) methanol in 20 ml of 10 mM
tetra-n-butylammonium hydrogen phosphate, pH 6.0 at a flow rate of 1 ml min-1 and further
with a linear gradient of 21-49 % (v/v) methanol in 15 ml of 10 mM tetra-n-butylammonium
hydrogen phosphate, pH 6.0. After washing the column with 49 % (v/v) methanol in 5 ml of 10
mM tetra-n-butylammonium hydrogen phosphate, pH 6.0, the column is equilibrated with 7 %
(v/v) methanol in 20 ml of 10 mM tetra-n-butylammonium hydrogen phosphate, pH 6.0. The
effluent is monitored by a continuous-flow radio detector (Beta-RAM, Biostep GmbH, Jahnsdorf,
Germany). The retention volumes of 2C-methyl-erythritol 2,4-cyclodiphosphate,
1-hydroxy-2-methyl-2-(E)-butenyl-4-diphosphate and DMAPP/IPP are 18, 24 and 39 ml, respectively.
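
The gradient programme can be written as a piecewise linear function of the delivered volume, which allows the programmed methanol content at each retention volume to be estimated. The following sketch rests on assumptions not stated in the text: the gradient is taken as delivered at the pump, and the delay volume between pump and detector is neglected. The names used are illustrative.

```python
# (start_ml, end_ml, start_percent, end_percent) of methanol (v/v)
GRADIENT = [
    (0.0, 20.0, 7.0, 21.0),    # linear gradient 7-21 % in 20 ml
    (20.0, 35.0, 21.0, 49.0),  # linear gradient 21-49 % in 15 ml
    (35.0, 40.0, 49.0, 49.0),  # washing with 49 % in 5 ml
]


def methanol_percent(volume_ml):
    """Programmed methanol content (% v/v) at a given delivered volume."""
    for start, end, pct_start, pct_end in GRADIENT:
        if start <= volume_ml <= end:
            fraction = (volume_ml - start) / (end - start)
            return pct_start + fraction * (pct_end - pct_start)
    raise ValueError("volume outside the programmed gradient")


for compound, volume in [
    ("2C-methyl-erythritol 2,4-cyclodiphosphate", 18.0),
    ("1-hydroxy-2-methyl-2-(E)-butenyl-4-diphosphate", 24.0),
    ("DMAPP/IPP", 39.0),
]:
    print(f"{compound}: {methanol_percent(volume):.1f} % methanol")
```
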

After 10 minutes of incubation, about 13 % of 2C-methyl-erythritol 2,4-cyclodiphosphate have
been converted into 1-hydroxy-2-methyl-2-(E)-butenyl-4-diphosphate (5 %) and into
DMAPP/IPP (8 %), respectively.

After 45 min, no 1-hydroxy-2-methyl-2-(E)-butenyl-4-diphosphate, but about 21 % of
DMAPP/IPP is found in the assay mixture.

Example 33 

Screening of IspH (LytB) activity 

Assay mixtures contain 100 mM Tris hydrochloride, pH 7.4, 1.2 mM DTT, 10 mM NaF, 0.5 mM
NADH, 60 µM FAD, 0.004 µM (18 µCi mol-1) [2-14C]1-hydroxy-2-methyl-2-(E)-butenyl-4-diphosphate,
0.5 mM pamidronate (Dunford et al., 2001) and 20 µl of crude cell extract of
M15-pMALlytB cells (prepared as described in example 2) in a total volume of 150 µl. The mixture
is incubated for 30 min at 37 °C. The reaction is terminated by cooling on ice, addition of 10 µl
of 30 % (w/v) trichloroacetic acid and immediate neutralization with 20 µl of 1 M sodium
hydroxide. The mixtures are centrifuged and aliquots (130 µl) of the supernatant are analyzed
by reversed-phase ion-pair HPLC using a column of Multospher 120 RP 18-AQ-5 (4.6 x 250
mm, CS-Chromatographie Service GmbH, Langerwehe, Germany). The column is developed with a
linear gradient of 7-21 % (v/v) methanol in 20 ml of 10 mM tetra-n-butylammonium hydrogen
phosphate, pH 6.0 at a flow rate of 1 ml min-1 and further with a linear gradient of 21-49 % (v/v)
methanol in 15 ml of 10 mM tetra-n-butylammonium hydrogen phosphate, pH 6.0. After
washing the column with 49 % (v/v) methanol in 5 ml of 10 mM tetra-n-butylammonium
hydrogen phosphate, pH 6.0, the column is equilibrated with 7 % (v/v) methanol in 20 ml of 10
mM tetra-n-butylammonium hydrogen phosphate, pH 6.0. The effluent is monitored by a
continuous-flow radio detector (Beta-RAM, Biostep GmbH, Jahnsdorf, Germany).

Under standard assay conditions, the HPLC peak corresponding to the substrate
1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate disappears completely, whereas two new peaks
corresponding to DMAPP and IPP appear, when crude cell extract of E. coli M15-pMALlytB
cells is used as protein source. No conversion of 1-hydroxy-2-methyl-2-(E)-butenyl
4-diphosphate into DMAPP and IPP can be observed when crude cell extract of E. coli wild-type
is used as protein source. These findings clearly show that the FAD and NADH- or
NADPH-dependent conversion of 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate into DMAPP and
IPP is catalyzed by the recombinant LytB protein. The addition of pamidronate to the assay
mixtures prevents further metabolization of IPP and DMAPP by highly active prenyl
transferases present in crude E. coli extracts and thereby effects the complete conversion of
1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate into DMAPP and IPP.

Example 34 

Construction of a vector carrying the dxs, xylB and ispC genes capable for the transcription 
and expression thereof 

The B. subtilis ORF dxs (accession no. dbj D84432) from base pair (bp) position 193991 to
195892 is amplified by PCR using pBSDXSBACSU plasmid DNA as template (see patent
application PCT/EP00/07548). The reaction mixture contains 10 pmol of the primer
5'-GGCGACTCGCGAGAGGAGAAATTAACCATGGATCTTTTATCAATACAGGACC-3', 10 pmol
of the primer 5'-GGCACCCGGCCGTCATGATCCAATTCCTTTGTGTG-3', 20 ng DNA of
pBSDXSBACSU plasmid, 2 U of Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs in
a total volume of 100 µl containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH
8.8 and 0.1 % (w/w) Triton X-100.

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles for 60 sec at 94 °C, 60 sec
at 50 °C and 120 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the
mixture is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.
The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden).

2.4 µg of the vector pACYC184 (Chang and Cohen 1978, NEB) and 1.8 µg of the purified PCR
product are digested with NruI and EagI in order to produce DNA fragments with overlapping
ends. The restriction mixtures are prepared according to the conditions supplied by the
manufacturer (NEB) and are incubated for 3 h at 37 °C. Digested vector DNA and PCR product are
purified using the PCR purification kit from Qiagen.
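
Whether the two dxs primers carry the recognition sequences needed for this digestion can be verified by a simple string search. The sketch below assumes the commonly cited recognition sequences TCGCGA for NruI and CGGCCG for EagI; both are palindromic, so the orientation of the primer does not matter. The helper name is hypothetical.

```python
# Recognition sequences (assumed from standard enzyme references)
SITES = {"NruI": "TCGCGA", "EagI": "CGGCCG"}


def find_sites(sequence, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every site found."""
    sequence = sequence.upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start)
            start = sequence.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


forward = "GGCGACTCGCGAGAGGAGAAATTAACCATGGATCTTTTATCAATACAGGACC"
reverse = "GGCACCCGGCCGTCATGATCCAATTCCTTTGTGTG"

print("forward primer:", find_sites(forward))
print("reverse primer:", find_sites(reverse))
```

In both primers the site is preceded by six nucleotides, which leaves the enzyme some flanking sequence at the end of the PCR product.
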

20 ng of the purified vector DNA and 19 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pACYCdxs. The ligation mixture is incubated for 2 h at 25 °C. 1 µl of the
ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The plasmid
pACYCdxs is isolated with the plasmid isolation kit from Qiagen.
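
The DNA masses used for the ligation can be converted into an approximate molar ratio of insert to vector. The sketch below is a rough estimate only: it uses an average mass of 650 g/mol per base pair, takes the insert length from the dxs coordinates given above (primer extensions are ignored), and uses a purely hypothetical length of 4000 bp for the digested vector, since the size of the vector fragment is not stated in the text.

```python
def pmol_dsdna(mass_ng, length_bp, g_per_mol_per_bp=650.0):
    """Approximate amount of double-stranded DNA in pmol."""
    return mass_ng * 1000.0 / (length_bp * g_per_mol_per_bp)


INSERT_BP = 195892 - 193991 + 1  # dxs ORF, 1902 bp
VECTOR_BP = 4000                 # hypothetical placeholder, not from the text

insert_pmol = pmol_dsdna(19.0, INSERT_BP)
vector_pmol = pmol_dsdna(20.0, VECTOR_BP)
molar_ratio = insert_pmol / vector_pmol

print(f"insert: {insert_pmol:.4f} pmol, vector: {vector_pmol:.4f} pmol")
print(f"insert:vector molar ratio: {molar_ratio:.1f}")
```
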

The DNA insert of the plasmid pACYCdxs is sequenced by the automated dideoxynucleotide
method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with the ABI Prism™
Sequencing Analysis Software from Applied Biosystems Divisions. It is identical with the DNA
sequence of the database entry (dbj D84432).

2.0 µg of the vector pACYCdxs and 8 µg of the vector pBScyclo (see example XXx) are
digested with EagI and SalI in order to produce DNA fragments with overlapping ends. The
restriction mixtures are prepared according to the conditions supplied by the manufacturer (NEB)
and are incubated for 3 h at 37 °C. Digested vector DNA and PCR product are purified using
the PCR purification kit from Qiagen.

20 ng of the digested and purified pACYCdxs vector DNA and 30 ng of a 2.7 kb EagI/SalI
fragment separated by DNA electrophoresis and purified (containing the ORFs xylB
and ispC from E. coli) are ligated together with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase
buffer (Gibco) in a total volume of 10 µl, yielding the plasmid pACYCdxsxylBispC. The ligation
mixture is incubated for 2 h at 25 °C. 1 µl of the ligation mixture is transformed into
electrocompetent E. coli XL1-Blue cells. The plasmid pACYCdxsxylBispC is isolated with the
plasmid isolation kit from Qiagen.

The DNA insert of the plasmid pACYCdxsxylBispC is sequenced by the automated 
dideoxynucleotide method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with 
the ABI Prism™ Sequencing Analysis Software from Applied Biosystems Divisions. 

Example 35 

Construction of vectors carrying the dxs, xylB, ispC and ispG and optionally ispH genes
capable of the transcription and expression thereof

The E. coli ORF ispH (lytB) (accession no. gb AE000113) from base pair (bp) position 5618 to
6568 is amplified by PCR using chromosomal E. coli DNA as template. The reaction mixture
contains 10 pmol of the primer
5'-GCTTGCGTCGACGAGGAGAAATTAACCATGCAGATCCTGTTGGCCACC-3', 10 pmol of the
primer 5'-GCTGCTCTCGAGTTAATCGACTTCACGAATATCG-3', 20 ng of chromosomal DNA,
2 U of Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs in a total volume of 100 µl
containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and 0.1 % (w/w)
Triton X-100.
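
The base pair positions quoted for the ispH ORF determine its length and the number of codons. The sketch below is a minimal illustration with hypothetical function names; it assumes that both positions are inclusive and that the range contains the stop codon. The PCR product itself is somewhat longer because of the primer extensions.

```python
def orf_length_bp(first_bp, last_bp):
    """Length of a region given inclusive first and last positions."""
    return last_bp - first_bp + 1


def codon_count(first_bp, last_bp):
    """Number of codons; raises if the length is not a multiple of three."""
    length = orf_length_bp(first_bp, last_bp)
    if length % 3 != 0:
        raise ValueError(f"{length} bp is not a whole number of codons")
    return length // 3


isph_bp = orf_length_bp(5618, 6568)
isph_codons = codon_count(5618, 6568)
print(f"ispH: {isph_bp} bp, {isph_codons} codons including the stop codon")
```
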

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles for 45 sec at 94 °C, 45 sec
at 50 °C and 60 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.
The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden).

2.5 µg of the vector pACYCdxsxylBispC (see example 34) are linearized with SalI and 0.9 µg
of the purified PCR product are digested with SalI and XhoI in order to produce DNA fragments
with overlapping ends. The restriction mixtures are prepared according to the conditions
supplied by the manufacturer (NEB) and are incubated for 3 h at 37 °C. Digested vector DNA and
PCR product are purified using the PCR purification kit from Qiagen.

15 ng of the purified vector DNA and 18 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pACYCdxsxylBispClytB. The ligation mixture is incubated for 2 h at 25 °C.
1 µl of the ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The
plasmid pACYCdxsxylBispClytB is isolated with the plasmid isolation kit from Qiagen.

The DNA insert of the plasmid pACYCdxsxylBispClytB is sequenced by the automated
dideoxynucleotide method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with
the ABI Prism™ Sequencing Analysis Software from Applied Biosystems Divisions. It is
identical with the DNA sequence of the database entry (gb AE000113).

The E. coli ORF ispG (gcpE) (accession no. gb AE000338) from base pair (bp) position 372 to
1204 is amplified by PCR using chromosomal E. coli DNA as template. The reaction mixture
contains 10 pmol of the primer
5'-GGTCGAGTCGACGAGGAGAAATTAACCATGCATAACCAGGCTCCAATTC-3', 10 pmol of
the primer 5'-CCCATCCTCGAGTTATTTTTCAACCTGCTGAACGTC-3', 20 ng of chromosomal
DNA, 2 U of Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs in a total volume of
100 µl containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and 0.1 %
(w/w) Triton X-100.

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles for 60 sec at 94 °C, 60 sec
at 50 °C and 90 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.
The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden).

2.0 µg each of the vectors pACYCdxsxylBispC (see example 34) and pACYCdxsxylBispClytB
(see above) are linearized with SalI and 1.1 µg of the purified PCR product are digested with
SalI and XhoI in order to produce DNA fragments with overlapping ends. The restriction
mixtures are prepared according to the conditions supplied by the manufacturer (NEB) and are
incubated for 3 h at 37 °C. Digested vector DNA and PCR product are purified using the PCR
purification kit from Qiagen.

18 ng of the purified vector DNAs and 23 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmids pACYCdxsxylBispCgcpE and pACYCdxsxylBispClytBgcpE. The ligation
mixtures are incubated for 2 h at 25 °C. 1 µl of the ligation mixture is transformed into
electrocompetent E. coli XL1-Blue cells. The plasmids pACYCdxsxylBispCgcpE and
pACYCdxsxylBispClytBgcpE are isolated with the plasmid isolation kit from Qiagen.

The DNA inserts of the plasmids pACYCdxsxylBispCgcpE and pACYCdxsxylBispClytBgcpE 
are sequenced by the automated dideoxynucleotide method using an ABI Prism 377™ DNA 
sequencer from Perkin Elmer with the ABI Prism™ Sequencing Analysis Software from Applied 
Biosystems Divisions. They are identical with the DNA sequence of the database entry (gb 
AE000338). 

Example 36 

Incorporation experiment with recombinant Escherichia coli XL1-pACYCdxsxylBispCgcpE
using [U-13C6]glucose

0.2 litre of Terrific Broth (TB) medium containing 5 mg of chloramphenicol is inoculated with
E. coli strain XL1-Blue harbouring the plasmid pACYCdxsxylBispCgcpE. The cells are grown
in a shaking culture at 37 °C overnight. At an optical density (600 nm) of 1.7-2.4 a solution
containing 1 g of lithium lactate (10 mmol) and 200 mg [U-13C6]glucose (1.1 mmol) at a final
volume of 24 ml (pH=7.4) is added continuously within 2 hours. Then, after 1 hour an aliquot
of 40 ml is taken and centrifuged for 20 min at 5,000 rpm and 4 °C. The cells are washed
with water containing 0.9 % NaCl and centrifuged as described above. The cells are
suspended in 700 µl of a mixture of methanol-d4 and D2O (6:4, v/v) containing 10 mM NaF,
cooled on ice and sonified 3x10 sec with a Branson Sonifier 250 (Branson SONIC Power
Company) set to 90 % duty cycle output, control value of 4 output. The suspension is
centrifuged at 15,000 rpm for 15 min. NMR spectra of the cell free extracts are recorded
directly with a Bruker AVANCE DRX 500 spectrometer (Karlsruhe, Germany). In order to avoid
degradation during work-up, the structures of the products are determined by NMR
spectroscopy without further purification.
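
The molar amounts quoted for the feeding solution can be checked from the molar masses of the two compounds. The sketch below is an approximate calculation using rounded atomic masses (these values and all names are assumptions of the example, not taken from the text); the glucose is treated as fully 13C-labelled at all six carbon atoms.

```python
# Rounded atomic masses in g/mol (assumed standard values)
MASS = {"H": 1.00794, "C": 12.011, "13C": 13.00335, "O": 15.9994, "Li": 6.941}


def molar_mass(formula):
    """Molar mass in g/mol from a {element: count} mapping."""
    return sum(MASS[element] * count for element, count in formula.items())


U13C6_GLUCOSE = {"13C": 6, "H": 12, "O": 6}
LITHIUM_LACTATE = {"C": 3, "H": 5, "Li": 1, "O": 3}

# mg divided by g/mol gives mmol
glucose_mmol = 200.0 / molar_mass(U13C6_GLUCOSE)
lactate_mmol = 1000.0 / molar_mass(LITHIUM_LACTATE)

print(f"[U-13C6]glucose: {glucose_mmol:.2f} mmol")
print(f"lithium lactate: {lactate_mmol:.1f} mmol")
```

Both values agree with the rounded amounts given in the text (1.1 mmol and 10 mmol).
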

The 13C-NMR spectra showed signals accounting for 1-hydroxy-2-methyl-2-(E)-butenyl
4-diphosphate (cf. Tables 2 and 4, example 18) as major product. A formation of
2C-methyl-D-erythritol 2,4-cyclodiphosphate could not be detected.

Example 37

Incorporation experiment with recombinant Escherichia coli XL1-pACYCdxsxylBispClytBgcpE
using glucose

Example 36 can be carried out with recombinant Escherichia coli
XL1-pACYCdxsxylBispClytBgcpE using glucose for converting glucose to isopentenyl diphosphate
and/or dimethylallyl diphosphate.

Example 38 

Cloning of the ispG gene of Escherichia coli and expression as maltose binding fusion protein
(MBP-IspG)

The E. coli ORF ispG (gcpE) (accession no. gb AE000338) from base pair (bp) position 372 to
1204 is amplified by PCR using chromosomal E. coli DNA as template. The reaction mixture
contains 10 pmol of the primer 5'-GAACCGGAATTCATGCATAACCAGGCTCCAATTC-3', 10
pmol of the primer 5'-CGAGGCGGATCCCATCACG-3', 20 ng of chromosomal DNA, 2 U of
Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs in a total volume of 100 µl
containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and 0.1 % (w/w)
Triton X-100.

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles for 60 sec at 94 °C, 60 sec
at 50 °C and 90 sec at 72 °C follow. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.

The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden).

2.2 µg of the vector pMAL-C2 (NEB) and 0.8 µg of the purified PCR product are digested with
EcoRI and BamHI in order to produce DNA fragments with overlapping ends. The restriction
mixtures are prepared according to the conditions supplied by the manufacturer (NEB) and are
incubated for 3 h at 37 °C. Digested vector DNA and PCR product are purified using the PCR
purification kit from Qiagen.

20 ng of the purified vector DNA and 15 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pMALgcpE. The ligation mixture is incubated for 2 h at 25 °C. 1 µl of the
ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The plasmid
pMALgcpE is isolated with the plasmid isolation kit from Qiagen.

The DNA insert of the plasmid pMALgcpE is sequenced by the automated dideoxynucleotide
method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with the ABI Prism™
Sequencing Analysis Software from Applied Biosystems Divisions. It is identical with the DNA
sequence of the database entry (gb AE000338).

Example 39 

Cloning of the ispH gene of Escherichia coli and expression as maltose binding fusion protein
(MBP-IspH)

The E. coli ORF ispH (lytB) (accession no. gb AE000113) from base pair (bp) position 5618 to
6568 is amplified by PCR using chromosomal E. coli DNA as template. The reaction mixture
contains 10 pmol of the primer 5'-TGGAGGGGATCCATGCAGATCCTGTTGGCCACC-3', 10
pmol of the primer 5'-GCATTTCTGCAGAACTTAGGC-3', 20 ng of chromosomal DNA, 2 U of
Taq DNA polymerase (Eurogentec) and 20 nmol of dNTPs in a total volume of 100 µl
containing 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris-hydrochloride, pH 8.8 and 0.1 % (w/w)
Triton X-100.
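
The amounts given for the reaction mixture can be converted into final concentrations. The following is a minimal sketch, assuming that the stated 20 nmol of dNTPs is the total amount in the 100 µl reaction; the function name is illustrative and not part of the original protocol.

```python
def final_concentration_uM(amount_pmol: float, volume_ul: float) -> float:
    """Return the final concentration in µM (pmol per µl equals µmol per litre)."""
    return amount_pmol / volume_ul


# 10 pmol of each primer in a total volume of 100 µl
primer_uM = final_concentration_uM(10, 100)

# 20 nmol (= 20,000 pmol) of dNTPs in a total volume of 100 µl
dntp_uM = final_concentration_uM(20_000, 100)

print(f"primer: {primer_uM} µM, dNTPs: {dntp_uM} µM")
```

Under these assumptions each primer is present at 0.1 µM and the dNTPs at 200 µM.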

The mixture is denatured for 3 min at 94 °C. Then 30 PCR cycles for 45 sec at 94 °C, 45 sec
at 50 °C and 60 sec at 72 °C followed. After further incubation for 10 min at 72 °C, the mixture
is cooled to 4 °C. An aliquot of 2 µl is subjected to agarose gel electrophoresis.

The PCR amplificate is purified with the PCR purification kit from Qiagen (Hilden).

2.2 µg of the vector pMAL-C2 (NEB) and 0.7 µg of the purified PCR product are digested with
BamHI and PstI in order to produce DNA fragments with overlapping ends. The restriction
mixtures are prepared according to the conditions specified by the supplier (NEB) and are
incubated for 3 h at 37 °C. Digested vector DNA and PCR product are purified using the PCR
purification kit from Qiagen.
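
The BamHI and PstI digestion relies on recognition sites carried by the two primers. The following sketch simply searches the primer sequences quoted in this example for the standard recognition sequences GGATCC (BamHI) and CTGCAG (PstI); the helper name `find_sites` is illustrative only.

```python
# Standard recognition sequences of the two restriction enzymes used here
SITES = {"BamHI": "GGATCC", "PstI": "CTGCAG"}

# Primer sequences as quoted in Example 39 (5' to 3')
forward = "TGGAGGGGATCCATGCAGATCCTGTTGGCCACC"
reverse = "GCATTTCTGCAGAACTTAGGC"


def find_sites(primer: str) -> dict:
    """Map each enzyme whose site occurs in the primer to the 0-based site position."""
    return {name: primer.find(seq) for name, seq in SITES.items() if seq in primer}


print("forward primer:", find_sites(forward))
print("reverse primer:", find_sites(reverse))
```

The forward primer carries the BamHI site directly upstream of the ATG start codon, and the reverse primer carries the PstI site.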

20 ng of the purified vector DNA and 14 ng of the purified PCR product are ligated together
with 1 U of T4-Ligase (Gibco), 2 µl of T4-Ligase buffer (Gibco) in a total volume of 10 µl,
yielding the plasmid pMALlytB. The ligation mixture is incubated for 2 h at 25 °C. 1 µl of the
ligation mixture is transformed into electrocompetent E. coli XL1-Blue cells. The plasmid
pMALlytB is isolated with the plasmid isolation kit from Qiagen.
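
The insert:vector molar ratio implied by these amounts can be estimated as follows. This is a rough sketch only: the vector length of about 6.6 kb for pMAL-C2 and the average mass of 650 g/mol per base pair are assumptions that are not stated in the text, and the insert length is approximated by the amplified region (bp 5618 to 6568).

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of double-stranded DNA (rule of thumb, assumed)


def dsdna_pmol(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA (ng) into pmol of molecules."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


VECTOR_BP = 6646               # assumed length of pMAL-C2, not given in the text
INSERT_BP = 6568 - 5618 + 1    # amplified ispH region, 951 bp

vector_pmol = dsdna_pmol(20, VECTOR_BP)   # 20 ng of vector
insert_pmol = dsdna_pmol(14, INSERT_BP)   # 14 ng of PCR product
ratio = insert_pmol / vector_pmol

print(f"insert:vector molar ratio is approximately {ratio:.1f}:1")
```

With these assumed values the ligation is set up with a molar excess of insert over vector of roughly five to one.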

The DNA insert of the plasmid pMALlytB is sequenced by the automated dideoxynucleotide
method using an ABI Prism 377™ DNA sequencer from Perkin Elmer with the ABI Prism™
Sequencing Analysis Software from Applied Biosystems Divisions. It is identical with the DNA
sequence of the database entry (gb AE000113).

Example 40 

Preparation and purification of recombinant IspG maltose binding fusion protein (MBP-IspG)

0.5 liter of Luria Bertani (LB) medium containing 90 mg of ampicillin are inoculated with 10 ml
of an overnight culture of E. coli strain XL1-Blue harboring plasmid pMALgcpE. The culture is
grown in a shaking culture at 37 °C. At an optical density (600 nm) of 0.7, the culture is
induced with 2 mM IPTG. The culture is grown for a further 5 h. The cells are harvested by
centrifugation for 20 min at 5,000 rpm and 4 °C. The cells are washed with 20 mM Tris
hydrochloride pH 7.4, centrifuged as above and frozen at -20 °C for storage.

2 g of the cells are thawed in 20 ml of 20 mM Tris hydrochloride pH 7.4, 0.2 M sodium chloride
and 0.02 % (w/v) sodium azide (buffer A) in the presence of 1 mg ml⁻¹ lysozyme and 100 µg ml⁻¹
DNaseI. The mixture is incubated at 37 °C for 30 min, cooled on ice and sonified 6 x 10 sec
with a Branson Sonifier 250 (Branson SONIC Power Company) set to 70 % duty cycle output,
control value of 4 output. The suspension is centrifuged at 15,000 rpm at 4 °C for 30 min. The
cell free extract is applied on a column of amylose resin FF (column volume 25 ml, NEB)
previously equilibrated with buffer A at a flow rate of 2 ml min⁻¹. The column is washed with 130
ml of buffer A. MBP-IspG is eluted with a linear gradient of 0-10 mM maltose in buffer A.
MBP-IspG containing fractions are combined according to SDS-PAGE and dialyzed overnight
against 100 mM Tris hydrochloride pH 7.4. The homogeneity of MBP-IspG is judged by
SDS-PAGE. One band at 84 kDa is visible, which is in line with the calculated molecular mass. The
yield of pure MBP-IspG is 9 mg.

Example 41 

Preparation and purification of recombinant IspH maltose binding fusion protein (MBP-IspH)

0.5 liter of Luria Bertani (LB) medium containing 90 mg of ampicillin are inoculated with 10 ml
of an overnight culture of E. coli strain XL1-Blue harboring plasmid pMALlytB. The culture is
grown in a shaking culture at 37 °C. At an optical density (600 nm) of 0.7, the culture is
induced with 2 mM IPTG. The culture is grown for a further 5 h. The cells are harvested by
centrifugation for 20 min at 5,000 rpm and 4 °C. The cells are washed with 20 mM Tris
hydrochloride pH 7.4, centrifuged as above and frozen at -20 °C for storage.

2 g of the cells are thawed in 20 ml of 20 mM Tris hydrochloride pH 7.4, 0.2 M sodium
chloride and 0.02 % (w/v) sodium azide (buffer A) in the presence of 1 mg ml⁻¹ lysozyme and
100 µg ml⁻¹ DNaseI. The mixture is incubated at 37 °C for 30 min, cooled on ice and sonified
6 x 10 sec with a Branson Sonifier 250 (Branson SONIC Power Company) set to 70 % duty
cycle output, control value of 4 output. The suspension is centrifuged at 15,000 rpm at 4 °C
for 30 min. The cell free extract is applied on a column of amylose resin FF (column volume
25 ml, NEB) previously equilibrated with buffer A at a flow rate of 2 ml min⁻¹. The column is
washed with 130 ml of buffer A. MBP-IspH is eluted with a linear gradient of 0-10 mM
maltose in buffer A. MBP-IspH containing fractions are combined according to SDS-PAGE
and dialyzed overnight against 100 mM Tris hydrochloride pH 7.4. The homogeneity of
MBP-IspH is judged by SDS-PAGE. One band at 78 kDa is visible, which is in line with the
calculated molecular mass. The yield of pure MBP-IspH is 14 mg.
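
The statement that the 78 kDa band agrees with the calculated molecular mass can be checked with a back-of-the-envelope estimate. The sketch below is not the calculation used by the authors: it assumes an average residue mass of 110 Da and a mass of about 42.5 kDa for the maltose binding protein part, neither of which is given in the text, and it ignores linker residues.

```python
AVG_RESIDUE_MASS_KDA = 0.110  # rule-of-thumb average mass per amino acid (assumed)
MBP_MASS_KDA = 42.5           # approximate mass of the MBP fusion partner (assumed)


def orf_residues(start_bp: int, end_bp: int) -> int:
    """Number of amino acids encoded by an ORF given inclusive bp positions."""
    length = end_bp - start_bp + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3 - 1  # subtract the stop codon


# ispH (lytB) ORF positions from Example 39
isph_residues = orf_residues(5618, 6568)
fusion_mass_kda = MBP_MASS_KDA + isph_residues * AVG_RESIDUE_MASS_KDA

print(f"IspH: {isph_residues} residues, estimated MBP-IspH mass: {fusion_mass_kda:.1f} kDa")
```

The estimate lands within about 1 kDa of the band observed on SDS-PAGE.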

Example 42 

Synthesis of 1-hydroxy-2-methyl-but-2-enyl-4-diphosphate (see Fig. 7) 

4-Chloro-2-methyl-2-buten-1-al (Choi et al. (1999) J. Org. Chem. 64, 8051-8053)

A solution containing 1.17 ml of 2-methyl-2-vinyl-oxirane (12 mmol), 1.6 g of CuCl2 (12 mmol)
and 510 mg of LiCl (12 mmol) in 10 ml of ethyl acetate was heated to 80 °C for 30 min. The
reaction was stopped by adding 50 g of ice. The mixture was filtered through a sintered glass
funnel under reduced pressure. 100 ml of CH2Cl2 was added and the organic phase was
separated. The aqueous layer was extracted two times with 100 ml of CH2Cl2. The combined
organic phase was dried over anhydrous MgSO4, filtered, and concentrated. The crude product
was purified by chromatography over silica gel (CH2Cl2, 3 x 37 cm) to yield 0.755 g of a yellow
liquid (6.4 mmol, 53 %).
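
The reported amount and yield can be reproduced from the isolated mass. The following sketch assumes the molecular formula C5H7ClO for 4-chloro-2-methyl-2-buten-1-al and uses standard atomic masses; the helper names are illustrative.

```python
# Standard atomic masses in g/mol
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "Cl": 35.45}


def molar_mass(composition: dict) -> float:
    """Sum the atomic masses for a composition such as {"C": 5, "H": 7}."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


# 4-chloro-2-methyl-2-buten-1-al, C5H7ClO (formula derived from the name)
aldehyde_mw = molar_mass({"C": 5, "H": 7, "Cl": 1, "O": 1})

product_mmol = 755.0 / aldehyde_mw         # 0.755 g of isolated product
yield_percent = 100.0 * product_mmol / 12  # 12 mmol of oxirane used

print(f"M = {aldehyde_mw:.2f} g/mol, n = {product_mmol:.1f} mmol, yield = {yield_percent:.0f} %")
```

This gives 6.4 mmol and 53 %, in agreement with the values stated above.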

¹H NMR (CDCl3, 500 MHz) δ 9.43 (s, 1H), 6.50 (t, J = 7.5 Hz, 1H), 4.24 (d, J = 7.5 Hz, 2H),
1.77 (s, 3H)

¹³C NMR (CDCl3, 125 MHz) δ 194.3, 145.7, 141.1, 38.6, 9.1

4-Chloro-2-methyl-2-buten-1-al-dimethyl-acetal

A solution of 184 mg 4-chloro-2-methyl-2-buten-1-al (1.55 mmol), 600 µl of HC(OMe)3 (5.6
mmol) and a catalytic amount of p-TsOH was incubated for 3 h at room temperature. The crude
mixture was purified by chromatography over silica gel (n-hexane/ethyl acetate 7:3) to yield 177
mg of a colourless liquid (1.08 mmol, 72 %).

¹H NMR (CDCl3, 500 MHz) δ 5.78 (t, 1H, J = 7.9 Hz), 4.47 (s, 1H), 4.15 (d, J = 7.9 Hz, 2H), 3.33
(s, 6H), 1.73 (s, 3H)

¹³C NMR (CDCl3, 125 MHz) δ 137.6, 124.4, 106.0, 53.5, 39.6, 11.4

(E)-3-Formyl-2-buten-1-diphosphate triammonium salt (Davisson et al. (1986) J. Org.
Chem. 51, 4768)

To a solution of 4-chloro-2-methyl-2-buten-1-al-dimethyl-acetal chloride (25 mg, 0.15 mmol) in
250 µl of MeCN a solution of 0.162 g (0.18 mmol) of tris(tetra-n-butylammonium) hydrogen
pyrophosphate in 400 µl of MeCN was added slowly at room temperature, leading to an
orange-red solution. After 2 h the reaction was finished and the solvent was removed under
reduced pressure. The orange oil was dissolved in 3 ml of H2O and passed through a column
of DOWEX 50 WX8 (1 x 4 cm) cation-exchange resin (NH4+ form) that had been equilibrated
with 20 ml of 25 mM NH4HCO3. The column was eluted with 20 ml of 25 mM NH4HCO3. The
resulting solution was lyophilized. The obtained solid was dissolved in 2 ml water and acidified
with aqueous HCl to pH = 3. After 2 minutes the solution was neutralized and lyophilized.

¹H NMR (D2O, 360 MHz) δ 9.37 (s, 1H), 6.86 (t, 1H, J = 5.6 Hz), 4.85 (dd, J = 7.9, J = 5.8 Hz, 2H),
1.72 (s, 3H)

¹³C NMR (D2O, 90 MHz) δ 199.2, 153.1 (d, J = 7.5 Hz), 138.5, 63.2 (d, J = 4.9 Hz), 8.5

[1-³H]1-hydroxy-2-methyl-but-2-enyl-4-diphosphate

A solution containing 50 mCi (15 µmol) NaBH3T, 15 µmol 3-formyl-2-buten-1-diphosphate
triammonium salt and 100 mM Tris/HCl pH = 8 was incubated for 30 minutes at room
temperature. The solution was acidified by adding 1 M HCl to pH = 2. After 2 minutes the
solution was neutralized by adding 1 M NaOH.

The product was characterized by ion-exchange chromatography (see examples 20 and
25).

Example 43

γδ T cell stimulation assays

PBMCs from healthy donors (donor A and donor B) are isolated from heparinized peripheral
blood by density centrifugation over Ficoll-Hypaque (Amersham Pharmacia Biotech, Freiburg,
Germany). 5^10^ PBMCs/well are cultivated in 1 ml RPMI 1640 medium supplemented with
10% human AB serum (Klinik rechts der Isar, München, Germany), 2 mM L-glutamine, 10 mM
mercaptoethanol. Amounts of recombinant human IL-2 (kindly provided by Eurocetus,
Amsterdam, The Netherlands) and substrates are varied from 1 to 10 U and 10 to 0.1 µM,
respectively. 20 µM IPP (Echelon Research Laboratories Inc., Salt Lake City, USA) serves as
a positive control whereas medium alone serves as negative control. Incubation is done for
seven days at 37 °C in the presence of 7% CO2. The harvested cells are double-stained with
fluorescein isothiocyanate (FITC)-conjugated mouse anti-human monoclonal antibody Vδ2
TCR and phycoerythrin (PE)-conjugated monoclonal CD3 antibody. The cells are analyzed
using a FACScan supported with Cellquest (Becton Dickinson, Heidelberg, Germany).

The substrates (E)-1-hydroxy-3-methyl-but-2-enyl 4-diphosphate (HMBPP) and
3-formyl-but-2-enyl 1-diphosphate (Aldehyd) were prepared synthetically as described above.

It is found that both synthetically prepared substrates (HMBPP and Aldehyd) show at least
double stimulation compared to IPP when used at a concentration that is 200-fold lower than
the concentration of the IPP sample (Table 8).

Table 8. Activation of γδ T-cells by phosphororganic compounds

Substrate      Concentration   IL-2    % γδ T-cells
               [µM]            [U]     Donor A    Donor B

Medium                         1       1.51       2.75
Medium                         5       1.45       2.23
Medium                         10      1.32       1.68
IPP            20              1       8.19       6.24
IPP            20              5       14.42      9.32
IPP            20              10      16.6       11.86
IPP            1               1       1.56       2.22
IPP            1               5       1.59       2.67
IPP            1               10      1.71       2.19
IPP            0.1             1       1.3        2.15
IPP            0.1             5       1.3        2.26
IPP            0.1             10      1.01       2.54
HMBPP          10              1       3.3        31.42
HMBPP          10              5       17.38      63.48
HMBPP          10              10      24.94      63.34
HMBPP          1               1       5.57       35.34
HMBPP          1               5       14.4       54.12
HMBPP          1               10      19.85      55.90
HMBPP          0.1             1       11.78      32.21
HMBPP          0.1             5       22.92      44.69
HMBPP          0.1             10      34.69      36.33
HMBPP/IPP      0.5/0.5         1       7          30.35
HMBPP/IPP      0.5/0.5         5       15.38      53.76
HMBPP/IPP      0.5/0.5         10      24.19      46.58
Aldehyd        10              1       12.19      30.69
Aldehyd        10              5       34.69      30.33
Aldehyd        10              10      38.99      38.85
Aldehyd        1               1       15.91      21.18
Aldehyd        1               5       40.13      30.76
Aldehyd        1               10      48.28      36.69
Aldehyd        0.1             1       10         13.64
Aldehyd        0.1             5       19.77      18.45
Aldehyd        0.1             10      21.93      25.82
Aldehyd/IPP    0.5/0.5         1       13.98      22.11
Aldehyd/IPP    0.5/0.5         5       33.94      32.06
Aldehyd/IPP    0.5/0.6         10      42.84      36.25

IPP: isopentenyl diphosphate

HMBPP: (E)-1-hydroxy-3-methyl-but-2-enyl 4-diphosphate

Aldehyd: (E)-3-formyl-but-2-enyl 1-diphosphate (prepared according to example 42)
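
The comparison between the substrates at 0.1 µM and IPP at 20 µM (a 200-fold difference in concentration) can be made explicit by dividing the corresponding entries of Table 8. The sketch below copies the relevant values from the table; the variable and function names are illustrative.

```python
# % γδ T-cells as (donor A, donor B), keyed by IL-2 amount in U; values from Table 8
ipp_20uM = {1: (8.19, 6.24), 5: (14.42, 9.32), 10: (16.6, 11.86)}
hmbpp_01uM = {1: (11.78, 32.21), 5: (22.92, 44.69), 10: (34.69, 36.33)}
aldehyd_01uM = {1: (10.0, 13.64), 5: (19.77, 18.45), 10: (21.93, 25.82)}


def fold_over_ipp(sample: dict) -> dict:
    """Ratio of % γδ T-cells for a sample relative to 20 µM IPP at equal IL-2."""
    return {
        il2: tuple(value / reference for value, reference in zip(sample[il2], ipp_20uM[il2]))
        for il2 in sample
    }


for name, sample in (("HMBPP 0.1 µM", hmbpp_01uM), ("Aldehyd 0.1 µM", aldehyd_01uM)):
    for il2, (donor_a, donor_b) in fold_over_ipp(sample).items():
        print(f"{name}, IL-2 {il2} U: donor A {donor_a:.2f}x, donor B {donor_b:.2f}x")
```

The resulting ratios differ between the two donors and between IL-2 amounts, so the fold stimulation is best read per row rather than as a single figure.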



Example 44 

High through-put screening assay of 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate
synthase (IspG) activity

Assay mixtures contain 20 mM potassium phosphate, pH 7.0, 0.4 mM NADH, 0.5 mM
CoCl2, 0.2 mM 2C-methyl-D-erythritol 2,4-cyclodiphosphate, and 50 µl protein in a total
volume of 1 ml. The mixtures are incubated at 37 °C. The oxidation of NADH is monitored
photometrically at 340 nm. Alternatively, the concentration of NADH is determined by
measuring the relative fluorescence of NADH at 340 nm excitation/460 nm emission.
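
The photometric readout can be converted into a reaction rate with the Lambert-Beer law. The following is a minimal sketch, assuming the commonly used molar absorption coefficient of NADH at 340 nm (6220 M⁻¹ cm⁻¹) and a 1 cm light path; neither value is stated in the text, and the absorbance change in the example call is made up for illustration.

```python
EPSILON_NADH_340 = 6220.0  # M^-1 cm^-1, commonly used value for NADH at 340 nm (assumed)


def nadh_rate_nmol_per_min(delta_a340_per_min: float,
                           volume_ml: float = 1.0,
                           path_cm: float = 1.0) -> float:
    """Convert an absorbance decrease per minute into nmol NADH oxidised per minute."""
    molar_per_min = delta_a340_per_min / (EPSILON_NADH_340 * path_cm)  # mol/l per min
    return molar_per_min * (volume_ml * 1e-3) * 1e9                    # nmol per min


# Expected starting absorbance of 0.4 mM NADH in a 1 cm cuvette
initial_a340 = 0.4e-3 * EPSILON_NADH_340

# Hypothetical measurement: A340 falls by 0.0622 per minute in the 1 ml assay
rate = nadh_rate_nmol_per_min(0.0622)

print(f"initial A340 = {initial_a340:.2f}, rate = {rate:.1f} nmol/min")
```

An initial absorbance near 2.5 is high for many photometers, which is one practical reason the fluorescence readout mentioned above may be preferred in microplate formats.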

Example 45 

High through-put screening assay of 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate
reductase (IspH) activity

Assay mixtures contain 20 mM potassium phosphate, pH 7.0, pH 8.0, 0.4 mM NADH, 20 µM
FAD, 0.5 mM CoCl2, 0.2 mM 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate, and 50 µl
protein in a total volume of 1 ml. The mixtures are incubated at 37 °C. The oxidation of
NADH is monitored photometrically at 340 nm. Alternatively, the concentration of NADH is
determined by measuring the relative fluorescence of NADH at 340 nm excitation/460 nm
emission.
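
In a screening run, the NADH oxidation rates measured with and without a test compound are compared. One common way to express this comparison is percent inhibition, sketched below; the function name, the optional background correction and the numbers in the example call are illustrative assumptions and not part of the described assay.

```python
def percent_inhibition(rate_with_inhibitor: float,
                       rate_without_inhibitor: float,
                       background_rate: float = 0.0) -> float:
    """Percent inhibition from reaction rates, after subtracting an enzyme-free background."""
    uninhibited = rate_without_inhibitor - background_rate
    if uninhibited <= 0:
        raise ValueError("uninhibited rate must exceed the background rate")
    inhibited = rate_with_inhibitor - background_rate
    return 100.0 * (1.0 - inhibited / uninhibited)


# Hypothetical rates in nmol NADH per minute
print(percent_inhibition(2.5, 10.0))
```

A compound that lowers the rate from 10 to 2.5 nmol/min would thus be scored as 75 % inhibition.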

Claims

1. A protein in a form that is functional for the enzymatic conversion of
2C-methyl-D-erythritol 2,4-cyclodiphosphate to 1-hydroxy-2-methyl-2-butenyl 4-diphosphate
notably in its (E)-form.

2. The protein according to claim 1, wherein it is functional for said conversion in the
presence of NADH and/or NADPH. 

3. The protein according to claim 2, wherein it is functional for said conversion in the 
presence of Co²⁺.

4. The protein according to one of claims 1 to 3, wherein it has a sequence encoded by 
the ispG (formerly gcpE) gene of E. coli or a function-conservative homologue of said 
sequence. 

5. A protein in a form that is functional for the enzymatic conversion of
1-hydroxy-2-methyl-2-butenyl 4-diphosphate, notably in its (E)-form, to isopentenyl diphosphate
and/or dimethylallyl diphosphate.

6. The protein according to claim 5, wherein it is in a form functional for said conversion 
in the presence of FAD and NAD(P)H. 

7. The protein according to claim 6, wherein it is in a form functional for said conversion 
in the presence of a metal ion selected from the group of manganese, iron, cobalt, or 
nickel ion. 

8. The protein according to one of claims 5 to 7, wherein it has a sequence encoded by 
the ispH (formerly lytB) gene of E. coli or a function-conservative homologue of said 
sequence. 

9. The protein according to one of claims 1 to 8, wherein it is a plant protein, notably
from Arabidopsis thaliana.

10. The protein according to one of claims 1 to 8, wherein it is a bacterial protein, notably
from E. coli.

11. The protein according to one of claims 1 to 8, wherein it is a protozoal protein,
notably from Plasmodium falciparum.

12. Purified isolated nucleic acid encoding the protein according to one of claims 1 to 4 
and/or the protein according to one of claims 5 to 8 with or without introns.

13. A DNA expression vector containing the sequence of the nucleic acid according to 
claim 12. 

14. Use of a protein according to one of claims 1 to 11 for screening a chemical library 
for an inhibitor of the biosynthesis of isoprenoids.

15. Cells, cell cultures, organisms or parts thereof recombinantly endowed with the 
sequence of the nucleic acid according to claim 12 or with the vector according to 
claim 13, wherein said cell is selected from the group consisting of bacterial, 
protozoal, fungal, plant, insect and mammalian cells. 

16. Cells, cell cultures, organisms or parts thereof according to claim 15, wherein it is 
recombinantly endowed with a vector containing a nucleic acid sequence encoding a 
protein according to one of claims 1 to 4 and/or a protein according to one of claims 
5 to 8, and wherein said cell is optionally further endowed with at least one gene 
selected from the following group: dxs, dxr, ispD (formerly ygbP); ispE (formerly
ychB); ispF (formerly ygbB) of E. coli or a function-conservative homologue thereof,
or a function-conservative fusion, deletion or insertion variant of any of the above 
genes. 

17. Cells, cell cultures, or organisms or parts thereof transformed or transfected for an 
increased rate of formation of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate, notably 
in its (E)-form, compared to cells, cell cultures, or organisms or parts thereof absent
said transformation or transfection. 

18. Cells, cell cultures, or organisms or parts thereof transformed or transfected for an 
increased rate of conversion of (E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate to
isopentenyl diphosphate and/or dimethylallyl diphosphate compared to cells, cell 
cultures, or organisms or parts thereof absent said transformation or transfection. 

19. Cells, cell cultures, or organisms or parts thereof according to claim 15 transformed 
or transfected for an increased expression level of the protein of one of claims 1 to 4 
and/or the protein of one of claims 5 to 8 compared to cells, cell cultures, or 
organisms or parts thereof absent said transformation or transfection. 

20. Cells, cell cultures or organisms or parts thereof in accordance with claim 15 or 16, 
characterized by the recombinant endowment with sets of genes as follows: 

ispC (formerly dxr), ispD, ispE, ispF, ispG (formerly gcpE); or

ispC, ispD, ispE, ispF, ispG, ispH (formerly lytB); or

dxs, ispC, ispD, ispE, ispF, ispG; or

dxs, ispC, ispD, ispE, ispF, ispG, ispH; or

dxs, ispC, ispG; or

dxs, ispC, ispG, ispH

of E. coli or a function-conservative homologue thereof and/or a function- 
conservative fusion, deletion or insertion variant of any of the above genes. 

21. Cells, cell cultures or organisms or parts thereof in accordance with claim 20,
characterized by further recombinant endowment(s) with gene(s) being functional for 
biosynthetic steps downstream from the C5 isoprenoids. 

22. Cells, cell cultures or organisms or parts thereof in accordance with one of claims 15
to 21, wherein at least one gene of said recombinant endowments is equipped with
artificial ribosomal binding site(s) for expression of the corresponding gene 
product(s) at a rate enhanced compared to the rate in the absence of the artificial 
ribosomal binding site(s). 

23. Cells, cell cultures or organisms or parts thereof in accordance with one of claims 15 
to 22, wherein at least one of said recombinant endowments is due to a high copy 
replication vector.

24. Cells, cell cultures or organisms or parts thereof in accordance with one of claims 15
to 23, wherein they are of bacterial, protozoal, fungal, plant or animal origin. 

25. Use of the cells, cell cultures or organisms, or parts thereof in accordance with one 
of claims 15 to 24 or disruption products thereof for the enhanced rate of in vivo 
formation or for the efficient in vitro production of an, optionally isotopically labelled, 
biosynthetic intermediate or product of the non-mevalonate isoprenoid biosynthetic 
pathway, optionally by feeding 1-deoxy-D-xylulose or glucose that may be 
isotopically labelled. 

26. Use according to claim 25, wherein said intermediate or product is a C5-isoprenoid 
intermediate compound; or a >C5-isoprenoid compound; or a terpenoid compound. 

27. Use according to one of claims 25 or 26, wherein the rate of formation or production 
is enhanced by providing a source for CTP. 

28. Use according to claim 27, wherein the source for CTP is cytidine and/or uridine 
and/or cytosine and/or uracil and/or ribose and/or ribose 5-phosphate and/or any
biosynthetic precursor of CTP. 

29. Use according to one of claims 25 to 28, wherein the rate of formation or production 
is enhanced by providing a source for phosphorylation enhancement. 

30. Use according to claim 29, wherein the source for phosphorylation enhancement is 
glycerol 3-phosphate and/or phosphoenolpyruvate and/or inorganic phosphate 
and/or inorganic pyrophosphate and/or any organic phosphate or pyrophosphate. 

31. Use according to one of claims 25 to 30, wherein the rate of formation or production
is enhanced by providing a source for reduction equivalents. 

32. Use according to claim 31, wherein the source for reduction equivalents is succinate 
and/or lipids and/or glucose and/or glycerol and/or lactate.

33. Optionally isotope-labelled compound of the following formula I or a salt thereof:

(formula I; structure drawing not reproduced)

whereby R¹ and R² are different from each other and one of R¹ and R² is hydrogen
and the other is selected from the group consisting of -CH2-O-PO(OH)-O-PO(OH)2,
-CH2-O-PO(OH)2, and -CH2OH, and whereby A stands for -CH2OH or -CHO.

34. The optionally isotope-labelled compound according to claim 33, wherein A stands 
for -CH2OH. 

35. The optionally isotope-labelled compound according to claim 33, wherein A stands 
for -CHO. 

36. The optionally isotope-labelled compound according to one of claims 33 to 35, 
wherein R^ is H and R^ is selected from the group consisting of
-CH2-O-PO(OH)-O-PO(OH)2 and -CH2-O-PO(OH)2.

37. The optionally isotope-labelled compound according to one of claims 33 to 35, 
wherein R^ is H and R^ is selected from the group consisting of
-CH2-O-PO(OH)-O-PO(OH)2 and -CH2-O-PO(OH)2.

38. The optionally isotope-labelled compound according to one of claims 33 to 37, 
whereby said group consists of -CH2-O-PO(OH)-O-PO(OH)2.

39. Optionally isotope-labelled 1-hydroxy-2-methyl-2-butenyl 4-diphosphate salt or a 
protonated form thereof. 

40. Optionally isotope-labelled (£)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate salt or a 
protonated form thereof.

41. Optionally isotope-labelled (Z)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate salt or a
protonated form thereof.

42. Use of a compound according to one of claims 33 to 41, notably according to claim
38, for screening for genes, enzymes or inhibitors of the biosynthesis of isoprenoids
or terpenoids, either in vitro in the presence of an electron donor or in vivo.

43. Use of a compound according to one of claims 33 to 41, notably according to claim
40, as an immunomodulatory agent.

44. Use of a compound according to one of claims 33 to 41, notably according to claim
40, for activating γδ T cells.

45. Use of a compound according to one of claims 33 to 41, notably according to claim
40, for the preparation of a medicament.

46. Pharmaceutical composition containing a compound according to one of claims 33 to
41, notably according to claim 36 or 40, and a pharmaceutically acceptable carrier.

47. The pharmaceutical composition according to claim 46 further comprising an 
antibiotically active compound. 

48. The pharmaceutical composition according to claim 47, wherein the antibiotically 
active compound is bacteriostatic. 

49. The pharmaceutical composition according to claim 47, wherein the antibiotically
active compound inhibits bacterial protein synthesis. 

50. A method of treating a pathogen infection comprising administering a pharmaceutical 
composition according to one of claims 46 to 49. 

51. Monoclonal or polyclonal antibody against a compound of one of claims 33 to 41.

52. A method of detecting a pathogen, notably in a body fluid, by using the antibody of
claim 51.

53. Use of the cells, cell cultures or organisms or parts thereof in accordance with claims
15 to 24 for the production of a protein in an enzymatically competent form for the
conversion of 2C-methyl-D-erythritol 2,4-cyclodiphosphate into
1-hydroxy-2-methyl-2-butenyl 4-diphosphate, notably (E)-1-hydroxy-2-methyl-2-butenyl
4-diphosphate.

54. Use of the cells, cell cultures or organisms or parts thereof in accordance with claims 
15 to 24 for the production of a protein in an enzymatically competent form for the 
conversion of (E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate into isopentenyl
diphosphate and/or dimethylallyl diphosphate. 

55. Use of the cells, cell cultures or organisms or parts thereof in accordance with claims 
15 to 24 for the production of proteins in an enzymatically competent form for the 
conversion of 2C-methyl-D-erythritol 2,4-cyclodiphosphate into isopentenyl 
diphosphate and/or dimethylallyl diphosphate. 

56. A method of altering the expression level of the gene product(s) of ispG and/or ispH 
in cells, comprising

(a) transforming host cells with the ispG and/or ispH gene. 

(b) growing the transformed host cells of step (a) under conditions that are 
suitable for the efficient expression of ispG and/or ispH, resulting in 
production of altered levels of the ispG and/or ispH gene product(s) in the 
transformed cells relative to expression levels of untransformed cells. 

57. Method of identifying an inhibitor of an enzyme functional for the conversion of
2C-methyl-D-erythritol 2,4-cyclodiphosphate to 1-hydroxy-2-methyl-2-butenyl
4-diphosphate, notably its E-form, of the non-mevalonate isoprenoid pathway by the
following steps:

(a) incubating a mixture containing said enzyme with its, optionally
isotope-labeled, substrate 2C-methyl-D-erythritol 2,4-cyclodiphosphate under
conditions suitable for said conversion in the presence and in the absence of
a potential inhibitor,

(b) subsequently determining the concentration of 2C-methyl-D-erythritol
2,4-cyclodiphosphate and/or 1-hydroxy-2-methyl-2-butenyl 4-diphosphate, and

(c) comparing the concentration in the presence and in the absence of said
potential inhibitor.

58. Method of identifying an inhibitor of an enzyme functional for the conversion of
1-hydroxy-2-methyl-2-butenyl 4-diphosphate, notably its E-form, to isopentenyl
diphosphate or dimethylallyl diphosphate of the non-mevalonate isoprenoid pathway
by the following steps:

(a) incubating a mixture containing said enzyme with its, optionally
isotope-labeled, substrate 1-hydroxy-2-methyl-2-butenyl 4-diphosphate under
conditions suitable for said conversion in the presence and in the absence of 
a potential inhibitor, 

(b) determining the concentration of 1-hydroxy-2-methyl-2-butenyl 4-diphosphate 
and/or isopentenyl diphosphate or dimethylallyl diphosphate, and 

(c) comparing the concentration in the presence and in the absence of said 
potential inhibitor. 

59. The method according to claim 58, wherein step (a) is carried out in the presence of 
FAD. 

60. The method of one of claims 57 to 59, which further comprises preparing cells 
recombinantly endowed with a gene coding for said enzyme, culturing said cells, 
preparing a crude extract of said cells, and using said crude extract in step (a). 

61. The method according to one of claims 57 to 60, wherein said enzyme is a plant
enzyme. 

62. The method according to one of claims 57 to 60, wherein said enzyme is an enzyme
of Plasmodium falciparum. 

63. The method according to one of claims 57 to 60, wherein said enzyme is a bacterial 
enzyme.

64. The method according to one of claims 57 to 60, wherein the incubation of step (a) is
carried out in the presence of a sulfhydryl reductant, e.g. dithiothreitol.

65. The method according to one of claims 57 to 64, wherein the incubation in step (a) is
carried out in the presence of a phosphatase inhibitor. 

66. The method according to claim 65, wherein the phosphatase inhibitor is an alkali 
fluoride. 

67. The method according to one of claims 57 to 66, wherein the incubation of step (a) is 
carried out in the presence of NADH or NADPH. 

68. The method according to one of claims 57 to 67, wherein the incubation in step (a) is 
carried out in the presence of an inhibitor of an enzyme acting downstream of 
isopentenyl diphosphate or dimethylallyl diphosphate. 



69. The method according to one of claims 57 to 68, wherein the incubation of step (a) is 
carried out in the presence of a salt selected from the group of Co²⁺, Mn²⁺, Fe and
Ni salts.



70. The method according to one of claims 57 to 69, wherein step (b) is carried out by
reversed phase ion-pair HPLC chromatography.

71. The method according to one of claims 57 to 69, wherein step (b) is carried out by
determining the consumption of NADH or NADPH. 

72. The method according to one of claims 57 to 71, which is carried out on many 
potential inhibitors simultaneously or consecutively in a high-throughput screening. 

73. A process for the efficient in vivo synthesis of 1-hydroxy-2-methyl-2-butenyl
4-diphosphate, notably (E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate; or isopentenyl
diphosphate or dimethylallyl diphosphate; optionally isotope labelled, in salt form or
in protonated form, by the following steps:

(a) culturing cells, preferably bacterial cells, recombinantly endowed in accordance 
with one of claims 15 to 24 for said synthesis for a predetermined period of time 
at a predetermined temperature; 

(b) optionally adding glucose to a predetermined final concentration and further 
culturing for a predetermined period of time; 

(c) harvesting the cells; 

(d) preparing a crude extract from the harvested cells; 

(e) separating and purifying optionally isotope-labelled 1-hydroxy-2-methyl-2-butenyl
4-diphosphate, notably (E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate;
or isopentenyl diphosphate; or dimethylallyl diphosphate; in salt form or in 
protonated form, optionally by preparative chromatography. 

74. A process for screening chemical libraries for the presence or absence of inhibition 
of the biosynthesis of isoprenoids, notably by blocking the biosynthesis of the 
intermediates 1-hydroxy-2-methyl-2-butenyl 4-diphosphate, notably
(E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate and/or isopentenyl diphosphate and/or dimethylallyl
diphosphate, said screening comprising: 

(a) culturing cells, preferably bacterial cells, recombinantly endowed in accordance 
with claim 15 for a predetermined period of time at a predetermined 
temperature; 

(b) optionally adding glucose to a predetermined final concentration and further 
culturing for a predetermined period of time; 

(c) harvesting the cells; 

(d) preparing a crude extract from the harvested cells; 

whereby steps (a) to (d) are carried out in the presence and in the absence of a 
prospective inhibitor; 

(e) detecting difference(s) in the level(s) of 1-hydroxy-2-methyl-2-butenyl
4-diphosphate, notably (E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate and/or
isopentenyl diphosphate and/or dimethylallyl diphosphate; between the 
presence and absence of the prospective inhibitor; and

(f) correlating said detected difference(s) with the presence or absence of an
above-defined inhibition.

75. Cells, cell cultures or organisms or parts thereof for the efficient formation of a 
biosynthetic product or intermediate of the non-mevalonate pathway to isoprenoids 
or terpenoids, characterized by 

(a) first recombinant endowment with a gene functional for the biosynthesis of 1- 
deoxy-D-xylulose 5-phosphate from 1-deoxy-D-xylulose; 

(b) capability for the uptake of 1-deoxy-D-xylulose; and 

(c) recombinant endowment(s) with gene(s) being functional for the conversion of 
1-deoxy-D-xylulose 5-phosphate into desired downstream C5-intermediate(s) of 
said pathway. 

76. Cells, cell cultures or organisms or parts thereof in accordance with claim 75, 
wherein said gene(s) of said second recombinant endowment(s) code(s) for 
enzyme(s) for the formation of at least one of the following C5-intermediates of the 
non-mevalonate isoprenoid pathway: 

(a) 2C-methyl-D-erythritol 4-phosphate; 

(b) 4-diphosphocytidyl-2C-methyl-D-erythritol; 

(c) 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate; 

(d) 2C-methyl-D-erythritol 2,4-cyclodiphosphate; 

(e) 1-hydroxy-2-methyl-2-butenyl 4-diphosphate; 

(f) isopentenyl diphosphate; 

(g) dimethylallyl diphosphate. 

77. Cells, cell cultures or organisms or parts thereof in accordance with claim 75 or 76, 
characterized by the recombinant endowment with sets of genes as follows: 

(a) xylB, dxr; or

(b) xylB, dxr, ispD (formerly ygbP); or

(c) xylB, dxr, ispD, ispE (formerly ychB); or

(d) xylB, dxr, ispD, ispE, ispF (formerly ygbB); or

(e) xylB, dxr, ispD, ispE, ispF, ispG (formerly gcpE); or

(f) xylB, dxr, ispD, ispE, ispF, ispG, ispH (formerly lytB)

of E. coli or a function-conservative homologue thereof and/or a function-
conservative fusion, deletion or insertion variant of any of the above genes.
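
The relation between the cumulative gene sets of claim 77 and the C5-intermediates listed in claim 76 can be illustrated with a short Python sketch. The pairing of each added gene with the next intermediate in the list is an assumption made for illustration, as are the names `PATHWAY_STEPS` and `furthest_intermediate`; the sketch considers only the recombinant genes and ignores any enzymes the host cell supplies itself.

```python
# Hypothetical lookup: each gene added in claim 77 is paired, in order, with
# the next intermediate listed in claim 76. This pairing is an assumption for
# illustration; the claims themselves do not spell it out.
PATHWAY_STEPS = [
    ("dxr", "2C-methyl-D-erythritol 4-phosphate"),
    ("ispD", "4-diphosphocytidyl-2C-methyl-D-erythritol"),
    ("ispE", "4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate"),
    ("ispF", "2C-methyl-D-erythritol 2,4-cyclodiphosphate"),
    ("ispG", "1-hydroxy-2-methyl-2-butenyl 4-diphosphate"),
    ("ispH", "isopentenyl diphosphate and/or dimethylallyl diphosphate"),
]


def furthest_intermediate(genes):
    """Return the last intermediate reached by an unbroken run of genes.

    Only the recombinant genes passed in are considered; native host
    enzymes are deliberately ignored in this sketch.
    """
    present = set(genes)
    if "xylB" not in present:
        # Without xylB there is no conversion of 1-deoxy-D-xylulose
        # to its 5-phosphate (claim 75, feature (a)).
        return None
    reached = "1-deoxy-D-xylulose 5-phosphate"
    for gene, product in PATHWAY_STEPS:
        if gene not in present:
            break
        reached = product
    return reached


if __name__ == "__main__":
    print(furthest_intermediate(["xylB", "dxr", "ispD"]))
```

For the gene set (b) of claim 77 the sketch returns 4-diphosphocytidyl-2C-methyl-D-erythritol; for a set lacking xylB it returns `None`.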



78. Cells, cell cultures or organisms or parts thereof in accordance with claim 75,
characterized by the recombinant endowment with xylB and ispG (formerly gcpE)
and optionally at least one gene selected from the following group: dxr; ispD
(formerly ygbP); ispE (formerly ychB); ispF (formerly ygbB) of E. coli or a function-
conservative homologue thereof, or a function-conservative fusion, deletion or
insertion variant of any of the above genes.

79. Cells, cell cultures or organisms or parts thereof in accordance with claim 75,
characterized by the recombinant endowment with xylB and ispH (formerly lytB) and
optionally at least one gene selected from the following group: dxr; ispD (formerly
ygbP); ispE (formerly ychB); ispF (formerly ygbB); ispG (formerly gcpE) of E. coli or a
function-conservative homologue thereof, or a function-conservative fusion, deletion
or insertion variant of any of the above genes.



80. Cells, cell cultures or organisms or parts thereof in accordance with claim 75,
characterized by the recombinant endowment with xylB, ispG (formerly gcpE) and
ispH (formerly lytB) and optionally at least one gene selected from the following
group: dxr; ispD (formerly ygbP); ispE (formerly ychB); ispF (formerly ygbB) of E. coli
or a function-conservative homologue thereof, or a function-conservative fusion,
deletion or insertion variant of any of the above genes.



81. A process for the efficient in vivo synthesis of 2C-methyl-D-erythritol 4-phosphate; or
4-diphosphocytidyl-2C-methyl-D-erythritol; or 4-diphosphocytidyl-2C-methyl-D-erythritol
2-phosphate; or 2C-methyl-D-erythritol 2,4-cyclodiphosphate; or
1-hydroxy-2-methyl-2-butenyl 4-diphosphate, notably
(E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate; or isopentenyl diphosphate; or
dimethylallyl diphosphate; optionally isotope-labelled, in salt form or in
protonated form, by the following steps:

(a) culturing cells, preferably bacterial cells, recombinantly endowed in
accordance with one of claims 75 to 80 for said synthesis for a predetermined
period of time at a predetermined temperature;

(b) adding 1-deoxy-D-xylulose to a predetermined final concentration and further
culturing for a predetermined period of time;

(c) harvesting the cells;

(d) preparing a crude extract from the harvested cells;

(e) separating and purifying optionally isotope-labelled 2C-methyl-D-erythritol
4-phosphate; or 4-diphosphocytidyl-2C-methyl-D-erythritol; or
4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate; or
2C-methyl-D-erythritol 2,4-cyclodiphosphate; or 1-hydroxy-2-methyl-2-butenyl
4-diphosphate, notably (E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate; or
isopentenyl diphosphate; or dimethylallyl diphosphate; in salt form or in
protonated form, by preparative chromatography.

82. The process according to one of claims 73, 74 or 81, wherein step (a) is carried out
in terrific broth medium.

83. The process according to one of claims 73, 74, 81 or 82, wherein a source for CTP,
preferably cytidine or uridine, is added in step (a).

84. The process according to one of claims 73, 74 or 81 to 83, wherein a source of
phosphorylation activity, preferably glycerol 3-phosphate and/or inorganic phosphate,
is added in step (a).

85. The process according to one of claims 73, 74 or 81 to 84, wherein a source of
reduction equivalents, preferably succinate and/or lipids and/or glucose and/or
glycerol and/or lactate, is added in step (a).



86. A process for screening chemical libraries for the presence or absence of inhibition of
the biosynthesis of isoprenoids, notably by blocking the biosynthesis of the
intermediates 2C-methyl-D-erythritol 4-phosphate and/or
4-diphosphocytidyl-2C-methyl-D-erythritol and/or
4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate and/or
2C-methyl-D-erythritol 2,4-cyclodiphosphate and/or 1-hydroxy-2-methyl-2-butenyl
4-diphosphate, notably (E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate, and/or
isopentenyl diphosphate and/or dimethylallyl diphosphate, said screening
comprising: (i) carrying out the steps (a) to (d) of claim 81, preferably in combination
with one of claims 82 to 85, in the presence and absence of a prospective inhibitor;
(ii) detecting difference(s) in the level(s) of 1-deoxy-D-xylulose and/or
1-deoxy-D-xylulose 5-phosphate and/or 2C-methyl-D-erythritol 4-phosphate and/or
4-diphosphocytidyl-2C-methyl-D-erythritol and/or
4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate and/or
2C-methyl-D-erythritol 2,4-cyclodiphosphate and/or 1-hydroxy-2-methyl-2-butenyl
4-diphosphate, notably (E)-1-hydroxy-2-methyl-2-butenyl 4-diphosphate, and/or
isopentenyl diphosphate and/or dimethylallyl diphosphate between the presence
and absence of a prospective inhibitor; and (iii) correlating said detected
difference(s) with the presence or absence of an above-defined inhibition.

87. The process according to claim 74 or 86, wherein said detecting of step (ii) is done 
by HPLC and/or NMR spectroscopy. 
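
The detecting and correlating steps of the screening claims can be sketched in Python as a comparison of intermediate levels measured with and without the prospective inhibitor. The function name `detect_inhibition`, the dictionary layout and the cut-off ratio of 0.5 are hypothetical choices for illustration; the claims prescribe neither a particular threshold nor a data format, and a real assay might also have to consider intermediates that accumulate upstream of a blocked step.

```python
def detect_inhibition(levels_without, levels_with, threshold=0.5):
    """Flag intermediates whose level drops when the test compound is present.

    levels_without / levels_with: dicts mapping an intermediate name to a
    measured level (arbitrary units, e.g. an HPLC peak area).
    threshold: hypothetical cut-off; a with/without ratio below this value
    is treated as a difference indicating inhibition.
    Returns a dict of flagged intermediates and their ratios.
    """
    flagged = {}
    for name, control in levels_without.items():
        if control <= 0 or name not in levels_with:
            continue  # no usable reference value for this intermediate
        ratio = levels_with[name] / control
        if ratio < threshold:
            flagged[name] = ratio
    return flagged


if __name__ == "__main__":
    control = {"isopentenyl diphosphate": 10.0,
               "2C-methyl-D-erythritol 4-phosphate": 4.0}
    treated = {"isopentenyl diphosphate": 2.0,
               "2C-methyl-D-erythritol 4-phosphate": 4.0}
    print(detect_inhibition(control, treated))
```

An empty result corresponds to the absence of a detected inhibition under the chosen threshold.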

88. Vector comprising a sequence coding for one of the recombinant endowments as
defined in one of claims 75 to 80. 

89. A process for the chemical preparation of a compound of formula I or a salt thereof:

[Formula (I): structural drawing not reproduced; substituents A, R1 and R2]

wherein A represents -CH2OH and R1 and R2 are different from each other and one
of R1 and R2 is hydrogen and the other is -CH2-O-PO(OH)-O-PO(OH)2,
-CH2-O-PO(OH)2 or -CH2-OH, by the following steps:

(a) converting a compound of the following formula (II):

[Formula (II): structural drawing not reproduced]

wherein B is a protective group into a compound of the following formula (III) or
(IV):

[Formulae (III) and (IV): structural drawings not reproduced; the surviving labels show an O-B group in each]

by a Wittig or Horner reagent, wherein the group D is a precursor group
convertible reductively to a -CH2-OH group;

(b) reductively converting group D to a -CH2-OH group;

(c) optionally converting group -CH2-OH obtained in step (b) into
-CH2-O-PO(OH)-O-PO(OH)2 or -CH2-O-PO(OH)2 or salts thereof in a manner known per se;

(d) optionally conversion to a desired salt; 

(e) removing the protective group B. 

90. The process according to claim 89, wherein said protective group B forms an acetal
together with the remaining moiety of the compound of formula (II).

91. The process according to claim 89 or 90, wherein said protective group B is a
2-tetrahydropyranyl group.

92. The process according to one of claims 89 to 91, wherein group D is an 
alkoxycarbonyl group. 

93. The process according to one of claims 89 to 92, wherein said reduction of step (b) is 
performed with a metal hydride, notably an aluminium hydride or a boron hydride. 

94. The process according to one of claims 89 to 93, wherein step (c) comprises 
converting said -CH2-OH group to a -CH2-halide group.

95. The process according to one of claims 89 to 94, wherein step (c) comprises reacting
said -CH2-OH group with a sulfonic acid halogenide, notably tosyl chloride.

96. The process of one of claims 89 to 95, wherein step (c) comprises a reaction with
phosphoric acid or diphosphoric acid or a salt thereof.

97. The process of one of claims 89 to 96, wherein steps (a) to (c) are carried out in aprotic
solvents.

98. The process of one of claims 89 to 97, wherein step (e) is carried out by acid 
hydrolysis. 

99. A process for the chemical preparation of a compound of formula I or a salt thereof:

[Formula (I): structural drawing not reproduced]

wherein A represents -CH2OH or -CHO, R1 is hydrogen, and R2 is
-CH2-O-PO(OH)-O-PO(OH)2, -CH2-O-PO(OH)2 or -CH2-OH, by the following steps:

(a) converting 2-methyl-2-vinyl-oxirane into 4-chloro-2-methyl-2-buten-1-al;

(b) converting 4-chloro-2-methyl-2-buten-1-al to its acetal;

(c) substituting the chlorine atom in the product of step (b) by a hydroxyl group, a
phosphate group or a pyrophosphate group;

(d) hydrolysing the acetal obtained in step (c) to produce an aldehyde group;

(e) optionally converting the aldehyde group of the product of step (d) to a -CH2OH
group.

100. The process of claim 99, wherein step (a) is carried out in the presence of CuCl2.

101. The process of claim 99 or 100, wherein step (b) is carried out in the presence of an
ortho alkyl ester of formic acid.

102. The process of one of claims 99 to 101, wherein R2 is -CH2-O-PO(OH)-O-PO(OH)2 or
-CH2-O-PO(OH)2 and step (c) is carried out by reacting the product of step (b) with a
tetra-alkylammonium pyrophosphate or a tetra-alkylammonium phosphate,
respectively, in a polar aprotic solvent.

103. The process of one of claims 99 to 102, wherein step (e) is performed with an alkali
metal borohydride in aqueous solution.
Annex A

DNA sequence of the vector construct pBSxylBdxr

ID PBSXYLBDXR PRELIMINARY; DNA; 5628 BP.

SEQUENCE 5628 BP; 1378 A; 1374 C; 1552 G; 1324 T; 0 OTHER;


GTGGCACTTT TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT
CAAATATGTA TCCGCTCATG AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA
GGAAGAGTAT GAGTATTCAA CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT
GCCTTCCTGT TTTTGCTCAC CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT
TGGGTGCACG AGTGGGTTAC ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT
TTCGCCCCGA AGAACGTTTT CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG
TATTATCCCG TATTGACGCC GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA
ATGACTTGGT TGAGTACTCA CCAGTCACAG AAAAGCATCT TACGGATGGC ATGACAGTAA
GAGAATTATG CAGTGCTGCC ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA
CAACGATCGG AGGACCGAAG GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA
CTCGCCTTGA TCGTTGGGAA CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GAGCGTGACA
CCACGATGCC TGTAGCAATG GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA
CTCTAGCTTC CCGGCAACAA TTAATAGACT GGATGGAGGC GGATAAAGTT GCAGGACCAC
TTCTGCGCTC GGCCCTTCCG GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC
GTGGGTCTCG CGGTATCATT GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG
TTATCTACAC GACGGGGAGT CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA
TAGGTGCCTC ACTGATTAAG CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT
AGATTGATTT AAAACTTCAT TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA
ATCTCATGAC CAAAATCCCT TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGTAG
AAAAGATCAA AGGATCTTCT TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA
CAAAAAAACC ACCGCTACCA GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA CCAACTCTTT
TTCCGAAGGT AACTGGCTTC AGCAGAGCGC AGATACCAAA TACTGTCCTT CTAGTGTAGC
CGTAGTTAGG CCACCACTTC AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA
TCCTGTTACC AGTGGCTGCT GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA
GACGATAGTT ACCGGATAAG GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC
CCAGCTTGGA GCGAACGACC TACACCGAAC TGAGATACCT ACAGCGTGAG CTATGAGAAA
GCGCCACGCT TCCCGAAGGG AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA
CAGGAGAGCG CACGAGGGAG CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG
GGTTTCGCCA CCTCTGACTT GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC
TATGGAAAAA CGCCAGCAAC GCGGCCTTTT TACGGTTCCT GGCCTTTTGC TGGCCTTTTG
CTCACATGTT CTTTCCTGCG TTATCCCCTG ATTCTGTGGA TAACCGTATT ACCGCCTTTG
AGTGAGCTGA TACCGCTCGC CGCAGCCGAA CGACCGAGCG CAGCGAGTCA GTGAGCGAGG
AAGCGGAAGA GCGCCCAATA CGCAAACCGC CTCTCCCCGC GCGTTGGCCG ATTCATTAAT
GCAGCTGGCA CGACAGGTTT CCCGACTGGA AAGCGGGCAG TGAGCGCAAC GCAATTAATG
TGAGTTAGCT CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT
TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC CATGATTACG
CCAAGCGCGC AATTAACCCT CACTAAAGGG AACAAAAGCT GGAGCTCCAC CGCGGTGGCG
GCCGCTCTAG AACTAGTGGA TCCCCCGGGC TGCAGGAATT CGAGGAGAAA TTAACCATGT
ATATCGGGAT AGATCTTGGC ACCTCGGGCG TAAAAGTTAT TTTGCTCAAC GAGCAGGGTG
AGGTGGTTGC TGCGCAAACG GAAAAGCTGA CCGTTTCGCG CCCGCATCCA CTCTGGTCGG
AACAAGACCC GGAACAGTGG TGGCAGGCAA CTGATCGCGC AATGAAAGCT CTGGGCGATC
AGCATTCTCT GCAGGACGTT AAAGCATTGG GTATTGCCGG CCAGATGCAC GGAGCAACCT
TGCTGGATGC TCAGCAACGG GTGTTACGCC CTGCCATTTT GTGGAACGAC GGGCGCTGTG
CGCAAGAGTG CACTTTGCTG GAAGCGCGAG TTCCGCAATC GCGGGTGATT ACCGGCAACC
TGATGATGCC CGGATTTACT GCGCCTAAAT TGCTATGGGT TCAGCGGCAT GAGCCGGAGA
TATTCCGTCA AATCGACAAA GTATTATTAC CGAAAGATTA CTTGCGTCTG CGTATGACGG
GGGAGTTTGC CAGCGATATG TCTGACGCAG CTGGCACCAT GTGGCTGGAT GTCGCAAAGC
GTGACTGGAG TGACGTCATG CTGCAGGCTT GCGACTTATC TCGTGACCAG ATGCCCGCAT
TATACGAAGG CAGCGAAATT ACTGGTGCTT TGTTACCTGA AGTTGCGAAA GCGTGGGGTA
TGGCGACGGT GCCAGTTGTC GCAGGCGGTG GCGACAATGC AGCTGGTGCA GTTGGTGTGG
GAATGGTTGA TGCTAATCAG GCAATGTTAT CGCTGGGGAC GTCGGGGGTC TATTTTGCTG
TCAGCGAAGG GTTCTTAAGC AAGCCAGAAA GCGCCGTACA TAGCTTTTGC CATGCGCTAC
CGCAACGTTG GCATTTAATG TCTGTGATGC TGAGTGCAGC GTCGTGTCTG GATTGGGCCG
CGAAATTAAC CGGCCTGAGC AATGTCCCAG CTTTAATCGC TGCAGCTCAA CAGGCTGATG
AAAGTGCCGA GCCAGTTTGG TTTCTGCCTT ATCTTTCCGG CGAGCGTACG CCACACAATA
ATCCCCAGGC GAAGGGGGTT TTCTTTGGTT TGACTCATCA ACATGGCCCC AATGAACTGG
CGCGAGCAGT GCTGGAAGGC GTGGGTTATG CGCTGGCAGA TGGCATGGAT GTCGTGCATG
CCTGCGGTAT TAAACCGCAA AGTGTTACGT TGATTGGGGG CGGGGCGCGT AGTGAGTACT
GGCGTCAGAT GCTGGCGGAT ATCAGCGGTC AGCAGCTCGA TTACCGTACG GGGGGGGATG
TGGGGCCAGC ACTGGGCGCA GCAAGGCTGG CGCAGATCGC GGCGAATCCA GAGAAATCGC
TCATTGAATT GTTGCCGCAA CTACCGTTAG AACAGTCGCA TCTACCAGAT GCGCAGCGTT
ATGCCGCTTA TCAGCCACGA CGAGAAACGT TCCGTCGCCT CTATCAGCAA CTTCTGCCAT
TAATGGCGTA AAAGCTTGAG GAGAAATTAA CCATGAAGCA ACTCACCATT CTGGGCTCGA
CCGGCTCGAT TGGTTGCAGC ACGCTGGACG TGGTGCGCCA TAATCCCGAA CACTTCCGCG
TAGTTGCGCT GGTGGCAGGC AAAAATGTCA CTCGCATGGT AGAACAGTGC CTGGAATTCT
CTCCCCGCTA TGCCGTAATG GACGATGAAG CGAGTGCGAA ACTTCTTAAA ACGATGCTAC
AGCAACAGGG TAGCCGCACC GAAGTCTTAA GTGGGCAACA AGCCGCTTGC GATATGGCAG
CGCTTGAGGA TGTTGATCAG GTGATGGCAG CCATTGTTGG CGCTGCTGGG CTGTTACCTA
CGCTTGCTGC GATCCGCGCG GGTAAAACCA TTTTGCTGGC CAATAAAGAA TCACTGGTTA
CCTGCGGACG TCTGTTTATG GACGCCGTAA AGCAGAGCAA AGCGCAATTG TTACCGGTCG
ATAGCGAACA TAACGCCATT TTTCAGAGTT TACCGCAACC TATCCAGCAT AATCTGGGAT
ACGCTGACCT TGAGCAAAAT GGCGTGGTGT CCATTTTACT TACCGGGTCT GGTGGCCCTT
TCCGTGAGAC GCCATTGCGC GATTTGGCAA CAATGACGCC GGATCAAGCC TGCCGTCATC
CGAACTGGTC GATGGGGCGT AAAATTTCTG TCGATTCGGC TACCATGATG AACAAAGGTC
TGGAATACAT TGAAGCGCGT TGGCTGTTTA ACGCCAGCGC CAGCCAGATG GAAGTGCTGA
TTCACCCGCA GTCAGTGATT CACTCAATGG TGCGCTATCA GGACGGCAGT GTTCTGGCGC
AGCTGGGGGA ACCGGATATG CGTACGCCAA TTGCCCACAC CATGGCATGG CCGAATCGCG
TGAACTCTGG CGTGAAGCCG CTCGATTTTT GCAAACTAAG TGCGTTGACA TTTGCCGCAC
CGGATTATGA TCGTTATCCA TGCCTGAAAC TGGCGATGGA GGCGTTCGAA CAAGGCCAGG
CAGCGACGAC AGCATTGAAT GCCGCAAACG AAATCACCGT TGCTGCTTTT CTTGCGCAAC
AAATCCGCTT TACGGATATC GCTGCGTTGA ATTTATCCGT ACTGGAAAAA ATGGATATGC
GCGAACCACA ATGTGTGGAC GATGTGTTAT CTGTTGATGC GAACGCGCGT GAAGTCGCCA
GAAAAGAGGT GATGCGTCTC GCAAGCTGAG TCGACCTCGA GGGGGGGCCC GGTACCCAAT
TCGCCCTATA GTGAGTCGTA TTACGCGCGC TCACTGGCCG TCGTTTTACA ACGTCGTGAC
TGGGAAAACC CTGGCGTTAC CCAACTTAAT CGCCTTGCAG CACATCCCCC TTTCGCCAGC
TGGCGTAATA GCGAAGAGGC CCGCACCGAT CGCCCTTCCC AACAGTTGCG CAGCCTGAAT
GGCGAATGGA AATTGTAAGC GTTAATATTT TGTTAAAATT CGCGTTAAAT TTTTGTTAAA
TCAGCTCATT TTTTAACCAA TAGGCCGAAA TCGGCAAAAT CCCTTATAAA TCAAAAGAAT
AGACCGAGAT AGGGTTGAGT GTTGTTCCAG TTTGGAACAA GAGTCCACTA TTAAAGAACG
TGGACTCCAA CGTCAAAGGG CGAAAAACCG TCTATCAGGG CGATGGCCCA CTACGTGAAC
CATCACCCTA ATCAAGTTTT TTGGGGTCGA GGTGCCGTAA AGCACTAAAT CGGAACCCTA
AAGGGAGCCC CCGATTTAGA GCTTGACGGG GAAAGCCGGC GAACGTGGCG AGAAAGGAAG
GGAAGAAAGC GAAAGGAGCG GGCGCTAGGG CGCTGGCAAG TGTAGCGGTC ACGCTGCGCG
TAACCACCAC ACCCGCCGCG CTTAATGCGC CGCTACAGGG CGCGTCAG
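
The SEQUENCE header of each annex states the total length and the base composition, which allows a transcription of the sequence to be checked. The following Python sketch shows one way to do this; the function names `base_composition` and `matches_header` are hypothetical, and the figures in `ANNEX_A_HEADER` are taken from the Annex A header.

```python
from collections import Counter


def base_composition(sequence_text):
    """Count bases in a sequence written as whitespace-separated blocks."""
    bases = "".join(sequence_text.split()).upper()
    counts = Counter(bases)
    # Anything that is not A, C, G or T is reported as "other".
    other = sum(n for base, n in counts.items() if base not in "ACGT")
    return {"length": len(bases), "A": counts["A"], "C": counts["C"],
            "G": counts["G"], "T": counts["T"], "other": other}


def matches_header(sequence_text, declared):
    """Compare the counted composition with the values declared in a header."""
    composition = base_composition(sequence_text)
    return all(composition[key] == value for key, value in declared.items())


# Values stated in the Annex A header line.
ANNEX_A_HEADER = {"length": 5628, "A": 1378, "C": 1374,
                  "G": 1552, "T": 1324, "other": 0}

if __name__ == "__main__":
    # The declared base counts add up to the declared length.
    print(sum(ANNEX_A_HEADER[b] for b in "ACGT") == ANNEX_A_HEADER["length"])
    # Composition of the first two blocks of the annex.
    print(base_composition("GTGGCACTTT TCGGGGAAAT"))
```

Passing the full annex text to `matches_header` together with `ANNEX_A_HEADER` would indicate whether a given transcription agrees with the declared composition.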
Annex B

DNA sequence of the vector construct pBSxylBdxrispD

ID PBSXYLBDXRISPD PRELIMINARY; DNA; 6354 BP.

SEQUENCE 6354 BP; 1539 A; 1573 C; 1753 G; 1489 T;


GTGGCACTTT TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT
CAAATATGTA TCCGCTCATG AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA
GGAAGAGTAT GAGTATTCAA CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT
GCCTTCCTGT TTTTGCTCAC CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT
TGGGTGCACG AGTGGGTTAC ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT
TTCGCCCCGA AGAACGTTTT CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG
TATTATCCCG TATTGACGCC GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA
ATGACTTGGT TGAGTACTCA CCAGTCACAG AAAAGCATCT TACGGATGGC ATGACAGTAA
GAGAATTATG CAGTGCTGCC ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA
CAACGATCGG AGGACCGAAG GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA
CTCGCCTTGA TCGTTGGGAA CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GAGCGTGACA
CCACGATGCC TGTAGCAATG GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA
CTCTAGCTTC CCGGCAACAA TTAATAGACT GGATGGAGGC GGATAAAGTT GCAGGACCAC
TTCTGCGCTC GGCCCTTCCG GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC
GTGGGTCTCG CGGTATCATT GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG
TTATCTACAC GACGGGGAGT CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA
TAGGTGCCTC ACTGATTAAG CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT
AGATTGATTT AAAACTTCAT TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA
ATCTCATGAC CAAAATCCCT TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGTAG
AAAAGATCAA AGGATCTTCT TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA
CAAAAAAACC ACCGCTACCA GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA CCAACTCTTT
TTCCGAAGGT AACTGGCTTC AGCAGAGCGC AGATACCAAA TACTGTCCTT CTAGTGTAGC
CGTAGTTAGG CCACCACTTC AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA
TCCTGTTACC AGTGGCTGCT GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA
GACGATAGTT ACCGGATAAG GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC
CCAGCTTGGA GCGAACGACC TACACCGAAC TGAGATACCT ACAGCGTGAG CTATGAGAAA
GCGCCACGCT TCCCGAAGGG AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA
CAGGAGAGCG CACGAGGGAG CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG
GGTTTCGCCA CCTCTGACTT GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC
TATGGAAAAA CGCCAGCAAC GCGGCCTTTT TACGGTTCCT GGCCTTTTGC TGGCCTTTTG
CTCACATGTT CTTTCCTGCG TTATCCCCTG ATTCTGTGGA TAACCGTATT ACCGCCTTTG
AGTGAGCTGA TACCGCTCGC CGCAGCCGAA CGACCGAGCG CAGCGAGTCA GTGAGCGAGG
AAGCGGAAGA GCGCCCAATA CGCAAACCGC CTCTCCCCGC GCGTTGGCCG ATTCATTAAT
GCAGCTGGCA CGACAGGTTT CCCGACTGGA AAGCGGGCAG TGAGCGCAAC GCAATTAATG
TGAGTTAGCT CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT
TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC CATGATTACG
CCAAGCGCGC AATTAACCCT CACTAAAGGG AACAAAAGCT GGAGCTCCAC CGCGGTGGCG
GCCGCTCTAG AACTAGTGGA TCCCCCGGGC TGCAGGAATT CGAGGAGAAA TTAACCATGT
ATATCGGGAT AGATCTTGGC ACCTCGGGCG TAAAAGTTAT TTTGCTCAAC GAGCAGGGTG
AGGTGGTTGC TGCGCAAACG GAAAAGCTGA CCGTTTCGCG CCCGCATCCA CTCTGGTCGG
AACAAGACCC GGAACAGTGG TGGCAGGCAA CTGATCGCGC AATGAAAGCT CTGGGCGATC
AGCATTCTCT GCAGGACGTT AAAGCATTGG GTATTGCCGG CCAGATGCAC GGAGCAACCT
TGCTGGATGC TCAGCAACGG GTGTTACGCC CTGCCATTTT GTGGAACGAC GGGCGCTGTG
CGCAAGAGTG CACTTTGCTG GAAGCGCGAG TTCCGCAATC GCGGGTGATT ACCGGCAACC
TGATGATGCC CGGATTTACT GCGCCTAAAT TGCTATGGGT TCAGCGGCAT GAGCCGGAGA
TATTCCGTCA AATCGACAAA GTATTATTAC CGAAAGATTA CTTGCGTCTG CGTATGACGG
GGGAGTTTGC CAGCGATATG TCTGACGCAG CTGGCACCAT GTGGCTGGAT GTCGCAAAGC
GTGACTGGAG TGACGTCATG CTGCAGGCTT GCGACTTATC TCGTGACCAG ATGCCCGCAT
TATACGAAGG CAGCGAAATT ACTGGTGCTT TGTTACCTGA AGTTGCGAAA GCGTGGGGTA
TGGCGACGGT GCCAGTTGTC GCAGGCGGTG GCGACAATGC AGCTGGTGCA GTTGGTGTGG
GAATGGTTGA TGCTAATCAG GCAATGTTAT CGCTGGGGAC GTCGGGGGTC TATTTTGCTG
TCAGCGAAGG GTTCTTAAGC AAGCCAGAAA GCGCCGTACA TAGCTTTTGC CATGCGCTAC
CGCAACGTTG GCATTTAATG TCTGTGATGC TGAGTGCAGC GTCGTGTCTG GATTGGGCCG
CGAAATTAAC CGGCCTGAGC AATGTCCCAG CTTTAATCGC TGCAGCTCAA CAGGCTGATG
AAAGTGCCGA GCCAGTTTGG TTTCTGCCTT ATCTTTCCGG CGAGCGTACG CCACACAATA
ATCCCCAGGC GAAGGGGGTT TTCTTTGGTT TGACTCATCA ACATGGCCCC AATGAACTGG
CGCGAGCAGT GCTGGAAGGC GTGGGTTATG CGCTGGCAGA TGGCATGGAT GTCGTGCATG
CCTGCGGTAT TAAACCGCAA AGTGTTACGT TGATTGGGGG CGGGGCGCGT AGTGAGTACT
GGCGTCAGAT GCTGGCGGAT ATCAGCGGTC AGCAGCTCGA TTACCGTACG GGGGGGGATG
TGGGGCCAGC ACTGGGCGCA GCAAGGCTGG CGCAGATCGC GGCGAATCCA GAGAAATCGC
TCATTGAATT GTTGCCGCAA CTACCGTTAG AACAGTCGCA TCTACCAGAT GCGCAGCGTT
ATGCCGCTTA TCAGCCACGA CGAGAAACGT TCCGTCGCCT CTATCAGCAA CTTCTGCCAT
TAATGGCGTA AAAGCTTGAG GAGAAATTAA CCATGAAGCA ACTCACCATT CTGGGCTCGA
CCGGCTCGAT TGGTTGCAGC ACGCTGGACG TGGTGCGCCA TAATCCCGAA CACTTCCGCG
TAGTTGCGCT GGTGGCAGGC AAAAATGTCA CTCGCATGGT AGAACAGTGC CTGGAATTCT
CTCCCCGCTA TGCCGTAATG GACGATGAAG CGAGTGCGAA ACTTCTTAAA ACGATGCTAC
AGCAACAGGG TAGCCGCACC GAAGTCTTAA GTGGGCAACA AGCCGCTTGC GATATGGCAG
CGCTTGAGGA TGTTGATCAG GTGATGGCAG CCATTGTTGG CGCTGCTGGG CTGTTACCTA
CGCTTGCTGC GATCCGCGCG GGTAAAACCA TTTTGCTGGC CAATAAAGAA TCACTGGTTA
CCTGCGGACG TCTGTTTATG GACGCCGTAA AGCAGAGCAA AGCGCAATTG TTACCGGTCG
ATAGCGAACA TAACGCCATT TTTCAGAGTT TACCGCAACC TATCCAGCAT AATCTGGGAT 
ACGCTGACCT TGAGCAAAAT GGCGTGGTGT CCATTTTACT TACCGGGTCT GGTGGCCCTT 
TCCGTGAGAC GCCATTGCGC GATTTGGCAA CAATGACGCC GGATCAAGCC TGCCGTCATC 
CGAACTGGTC GATGGGGCGT AAAATTTCTG TCGATTCGGC TACCATGATG AACAAAGGTC 
TGGAATACAT TGAAGCGCGT TGGCTGTTTA ACGCCAGCGC CAGCCAGATG GAAGTGCTGA 
TTCACCCGCA GTCAGTGATT CACTCAATGG TGCGCTATCA GGACGGCAGT GTTCTGGCGC 
AGCTGGGGGA ACCGGATATG CGTACGCCAA TTGCCCACAC CATGGCATGG CCGAATCGCG 
TGAACTCTGG CGTGAAGCCG CTCGATTTTT GCAAACTAAG TGCGTTGACA TTTGCCGCAC
CGGATTATGA TCGTTATCCA TGCCTGAAAC TGGCGATGGA GGCGTTCGAA CAAGGCCAGG
CAGCGACGAC AGCATTGAAT GCCGCAAACG AAATCACCGT TGCTGCTTTT CTTGCGCAAC 
AAATCCGCTT TACGGATATC GCTGCGTTGA ATTTATCCGT ACTGGAAAAA ATGGATATGC 
GCGAACCACA ATGTGTGGAC GATGTGTTAT CTGTTGATGC GAACGCGCGT GAAGTCGCCA 
GAAAAGAGGT GATGCGTCTC GCAAGCTGAG TCGACGAGGA GAAATTAACC ATGGCAACCA 
CTCATTTGGA TGTTTGCGCC GTGGTTCCGG CGGCCGGATT TGGCCGTCGA ATGCAAACGG 
AATGTCCTAA GCAATATCTC TCAATCGGTA ATCAAACCAT TCTTGAACAC TCGGTGCATG 
CGCTGCTGGC GCATCCCCGG GTGAAACGTG TCGTCATTGC CATAAGTCCT GGCGATAGCC 
GTTTTGCACA ACTTCCTCTG GCGAATCATC CGCAAATCAC CGTTGTAGAT GGCGGTGATG 
AGCGTGCCGA TTCCGTGCTG GCAGGTCTGA AAGCCGCTGG CGACGCGCAG TGGGTATTGG 
TGCATGACGC CGCTCGTCCT TGTTTGCATC AGGATGACCT CGCGCGATTG TTGGCGTTGA
GCGAAACCAG CCGCACGGGG GGGATCCTCG CCGCACCAGT GCGCGATACT ATGAAACGTG
CCGAACCGGG CAAAAATGCC ATTGCTCATA CCGTTGATCG CAACGGCTTA TGGCACGCGC
TGACGCCGCA ATTTTTCCCT CGTGAGCTGT TACATGACTG TCTGACGCGC GCTCTAAATG 
AAGGCGCGAC TATTACCGAC GAAGCCTCGG CGCTGGAATA TTGCGGATTC CATCCTCAGT 
TGGTCGAAGG CCGTGCGGAT AACATTAAAG TCACGCGCCC GGAAGATTTG GCACTGGCCG 
AGTTTTACCT CACCCGAACC ATCCATCAGG AGAATACATA ACTCGAGGGG GGGCCCGGTA 
CCCAATTCGC CCTATAGTGA GTCGTATTAC GCGCGCTCAC TGGCCGTCGT TTTACAACGT 
CGTGACTGGG AAAACCCTGG CGTTACCCAA CTTAATCGCC TTGCAGCACA TCCCCCTTTC 
GCCAGCTGGC GTAATAGCGA AGAGGCCCGC ACCGATCGCC CTTCCCAACA GTTGCGCAGC 
CTGAATGGCG AATGGAAATT GTAAGCGTTA ATATTTTGTT AAAATTCGCG TTAAATTTTT 
GTTAAATCAG CTCATTTTTT AACCAATAGG CCGAAATCGG CAAAATCCCT TATAAATCAA 
AAGAATAGAC CGAGATAGGG TTGAGTGTTG TTCCAGTTTG GAACAAGAGT CCACTATTAA 
AGAACGTGGA CTCCAACGTC AAAGGGCGAA AAACCGTCTA TCAGGGCGAT GGCCCACTAC 
GTGAACCATC ACCCTAATCA AGTTTTTTGG GGTCGAGGTG CCGTAAAGCA CTAAATCGGA 
ACCCTAAAGG GAGCCCCCGA TTTAGAGCTT GACGGGGAAA GCCGGCGAAC GTGGCGAGAA 
AGGAAGGGAA GAAAGCGAAA GGAGCGGGCG CTAGGGCGCT GGCAAGTGTA GCGGTCACGC
TGCGCGTAAC CACCACACCC GCCGCGCTTA ATGCGCCGCT ACAGGGCGCG TCAG 
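
In the printed sequences, the string GAGGAGAAATTAACCATG recurs immediately in front of several inserted coding regions. The following Python sketch locates such occurrences in a sequence given as whitespace-separated blocks. The function name `find_motif` is hypothetical, and reading the motif as a ribosome-binding site followed by a start codon is an assumption, not a statement of the annexes.

```python
def find_motif(sequence_text, motif="GAGGAGAAATTAACCATG"):
    """Return the 1-based start positions of every occurrence of the motif.

    sequence_text may contain spaces and line breaks between blocks;
    they are removed before searching. Overlapping hits are reported.
    """
    bases = "".join(sequence_text.split()).upper()
    positions = []
    start = bases.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based
        start = bases.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    # Fragment of one printed row of the annex.
    print(find_motif("AAAGCTTGAG GAGAAATTAA CCATGAAGCA"))
```

Applied to a complete annex, the number of hits gives a rough count of the inserts that carry this upstream motif.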
Annex C

DNA sequence of the vector construct pBScyclo

ID PBSCYCLO PRELIMINARY; DNA; 7691 BP.

SEQUENCE 7691 BP; 1844 A; 1888 C; 2148 G; 1811 T;


GTGGCACTTT TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT
CAAATATGTA TCCGCTCATG AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA
GGAAGAGTAT GAGTATTCAA CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT
GCCTTCCTGT TTTTGCTCAC CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT
TGGGTGCACG AGTGGGTTAC ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT
TTCGCCCCGA AGAACGTTTT CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG
TATTATCCCG TATTGACGCC GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA
ATGACTTGGT TGAGTACTCA CCAGTCACAG AAAAGCATCT TACGGATGGC ATGACAGTAA
GAGAATTATG CAGTGCTGCC ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA
CAACGATCGG AGGACCGAAG GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA
CTCGCCTTGA TCGTTGGGAA CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GAGCGTGACA
CCACGATGCC TGTAGCAATG GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA
CTCTAGCTTC CCGGCAACAA TTAATAGACT GGATGGAGGC GGATAAAGTT GCAGGACCAC
TTCTGCGCTC GGCCCTTCCG GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC
GTGGGTCTCG CGGTATCATT GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG
TTATCTACAC GACGGGGAGT CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA
TAGGTGCCTC ACTGATTAAG CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT
AGATTGATTT AAAACTTCAT TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA
ATCTCATGAC CAAAATCCCT TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGTAG
AAAAGATCAA AGGATCTTCT TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA
CAAAAAAACC ACCGCTACCA GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA CCAACTCTTT
TTCCGAAGGT AACTGGCTTC AGCAGAGCGC AGATACCAAA TACTGTCCTT CTAGTGTAGC
CGTAGTTAGG CCACCACTTC AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA
TCCTGTTACC AGTGGCTGCT GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA
GACGATAGTT ACCGGATAAG GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC
CCAGCTTGGA GCGAACGACC TACACCGAAC TGAGATACCT ACAGCGTGAG CTATGAGAAA
GCGCCACGCT TCCCGAAGGG AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA
CAGGAGAGCG CACGAGGGAG CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG
GGTTTCGCCA CCTCTGACTT GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC
TATGGAAAAA CGCCAGCAAC GCGGCCTTTT TACGGTTCCT GGCCTTTTGC TGGCCTTTTG
CTCACATGTT CTTTCCTGCG TTATCCCCTG ATTCTGTGGA TAACCGTATT ACCGCCTTTG
AGTGAGCTGA TACCGCTCGC CGCAGCCGAA CGACCGAGCG CAGCGAGTCA GTGAGCGAGG
AAGCGGAAGA GCGCCCAATA CGCAAACCGC CTCTCCCCGC GCGTTGGCCG ATTCATTAAT
GCAGCTGGCA CGACAGGTTT CCCGACTGGA AAGCGGGCAG TGAGCGCAAC GCAATTAATG
TGAGTTAGCT CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT
TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC CATGATTACG
CCAAGCGCGC AATTAACCCT CACTAAAGGG AACAAAAGCT GGAGCTCCAC CGCGGTGGCG
GCCGCTCTAG AACTAGTGGA TCCCCCGGGC TGCAGGAATT CGAGGAGAAA TTAACCATGT
ATATCGGGAT AGATCTTGGC ACCTCGGGCG TAAAAGTTAT TTTGCTCAAC GAGCAGGGTG
AGGTGGTTGC TGCGCAAACG GAAAAGCTGA CCGTTTCGCG CCCGCATCCA CTCTGGTCGG
AACAAGACCC GGAACAGTGG TGGCAGGCAA CTGATCGCGC AATGAAAGCT CTGGGCGATC
AGCATTCTCT GCAGGACGTT AAAGCATTGG GTATTGCCGG CCAGATGCAC GGAGCAACCT
TGCTGGATGC TCAGCAACGG GTGTTACGCC CTGCCATTTT GTGGAACGAC GGGCGCTGTG
CGCAAGAGTG CACTTTGCTG GAAGCGCGAG TTCCGCAATC GCGGGTGATT ACCGGCAACC
TGATGATGCC CGGATTTACT GCGCCTAAAT TGCTATGGGT TCAGCGGCAT GAGCCGGAGA
TATTCCGTCA AATCGACAAA GTATTATTAC CGAAAGATTA CTTGCGTCTG CGTATGACGG
GGGAGTTTGC CAGCGATATG TCTGACGCAG CTGGCACCAT GTGGCTGGAT GTCGCAAAGC
GTGACTGGAG TGACGTCATG CTGCAGGCTT GCGACTTATC TCGTGACCAG ATGCCCGCAT
TATACGAAGG CAGCGAAATT ACTGGTGCTT TGTTACCTGA AGTTGCGAAA GCGTGGGGTA
TGGCGACGGT GCCAGTTGTC GCAGGCGGTG GCGACAATGC AGCTGGTGCA GTTGGTGTGG
GAATGGTTGA TGCTAATCAG GCAATGTTAT CGCTGGGGAC GTCGGGGGTC TATTTTGCTG
TCAGCGAAGG GTTCTTAAGC AAGCCAGAAA GCGCCGTACA TAGCTTTTGC CATGCGCTAC
CGCAACGTTG GCATTTAATG TCTGTGATGC TGAGTGCAGC GTCGTGTCTG GATTGGGCCG
CGAAATTAAC CGGCCTGAGC AATGTCCCAG CTTTAATCGC TGCAGCTCAA CAGGCTGATG
AAAGTGCCGA GCCAGTTTGG TTTCTGCCTT ATCTTTCCGG CGAGCGTACG CCACACAATA
ATCCCCAGGC GAAGGGGGTT TTCTTTGGTT TGACTCATCA ACATGGCCCC AATGAACTGG
CGCGAGCAGT GCTGGAAGGC GTGGGTTATG CGCTGGCAGA TGGCATGGAT GTCGTGCATG
CCTGCGGTAT TAAACCGCAA AGTGTTACGT TGATTGGGGG CGGGGCGCGT AGTGAGTACT
GGCGTCAGAT GCTGGCGGAT ATCAGCGGTC AGCAGCTCGA TTACCGTACG GGGGGGGATG
TGGGGCCAGC ACTGGGCGCA GCAAGGCTGG CGCAGATCGC GGCGAATCCA GAGAAATCGC
TCATTGAATT GTTGCCGCAA CTACCGTTAG AACAGTCGCA TCTACCAGAT GCGCAGCGTT
ATGCCGCTTA TCAGCCACGA CGAGAAACGT TCCGTCGCCT CTATCAGCAA CTTCTGCCAT
TAATGGCGTA AAAGCTTGAG GAGAAATTAA CCATGAAGCA ACTCACCATT CTGGGCTCGA
CCGGCTCGAT TGGTTGCAGC ACGCTGGACG TGGTGCGCCA TAATCCCGAA CACTTCCGCG
TAGTTGCGCT GGTGGCAGGC AAAAATGTCA CTCGCATGGT AGAACAGTGC CTGGAATTCT
CTCCCCGCTA TGCCGTAATG GACGATGAAG CGAGTGCGAA ACTTCTTAAA ACGATGCTAC
AGCAACAGGG TAGCCGCACC GAAGTCTTAA GTGGGCAACA AGCCGCTTGC GATATGGCAG
CGCTTGAGGA TGTTGATCAG GTGATGGCAG CCATTGTTGG CGCTGCTGGG CTGTTACCTA
CGCTTGCTGC GATCCGCGCG GGTAAAACCA TTTTGCTGGC CAATAAAGAA TCACTGGTTA
CCTGCGGACG TCTGTTTATG GACGCCGTAA AGCAGAGCAA AGCGCAATTG TTACCGGTCG
ATAGCGAACA TAACGCCATT TTTCAGAGTT TACCGCAACC TATCCAGCAT AATCTGGGAT 
ACGCTGACCT TGAGCAAAAT GGCGTGGTGT CCATTTTACT TACCGGGTCT GGTGGCCCTT 
TCCGTGAGAC GCCATTGCGC GATTTGGCAA CAATGACGCC GGATCAAGCC TGCCGTCATC 
CGAACTGGTC GATGGGGCGT AAAATTTCTG TCGATTCGGC TACCATGATG AACAAAGGTC 
TGGAATACAT TGAAGCGCGT TGGCTGTTTA ACGCCAGCGC CAGCCAGATG GAAGTGCTGA 
TTCACCCGCA GTCAGTGATT CACTCAATGG TGCGCTATCA GGACGGCAGT GTTCTGGCGC 
AGCTGGGGGA ACCGGATATG CGTACGCCAA TTGCCCACAC CATGGCATGG CCGAATCGCG 
TGAACTCTGG CGTGAAGCCG CTCGATTTTT GCAAACTAAG TGCGTTGACA TTTGCCGCAC 
CGGATTATGA TCGTTATCCA TGCCTGAAAC TGGCGATGGA GGCGTTCGAA CAAGGCCAGG 
CAGCGACGAC AGCATTGAAT GCCGCAAACG AAATCACCGT TGCTGCTTTT CTTGCGCAAC 
AAATCCGCTT TACGGATATC GCTGCGTTGA ATTTATCCGT ACTGGAAAAA ATGGATATGC 
GCGAACCACA ATGTGTGGAC GATGTGTTAT CTGTTGATGC GAACGCGCGT GAAGTCGCCA 
GAAAAGAGGT GATGCGTCTC GCAAGCTGAG TCGACGAGGA GAAATTAACC ATGGCAACCA 
CTCATTTGGA TGTTTGCGCC GTGGTTCCGG CGGCCGGATT TGGCCGTCGA ATGCAAACGG 
AATGTCCTAA GCAATATCTC TCAATCGGTA ATCAAACCAT TCTTGAACAC TCGGTGCATG 
CGCTGCTGGC GCATCCCCGG GTGAAACGTG TCGTCATTGC CATAAGTCCT GGCGATAGCC 
GTTTTGCACA ACTTCCTCTG GCGAATCATC CGCAAATCAC CGTTGTAGAT GGCGGTGATG 
AGCGTGCCGA TTCCGTGCTG GCAGGTCTGA AAGCCGCTGG CGACGCGCAG TGGGTATTGG 
TGCATGACGC CGCTCGTCCT TGTTTGCATC AGGATGACCT CGCGCGATTG TTGGCGTTGA 
GCGAAACCAG CCGCACGGGG GGGATCCTCG CCGCACCAGT GCGCGATACT ATGAAACGTG 
CCGAACCGGG CAAAAATGCC ATTGCTCATA CCGTTGATCG CAACGGCTTA TGGCACGCGC 
TGACGCCGCA ATTTTTCCCT CGTGAGCTGT TACATGACTG TCTGACGCGC GCTCTAAATG 
AAGGCGCGAC TATTACCGAC GAAGCCTCGG CGCTGGAATA TTGCGGATTC CATCCTCAGT 
TGGTCGAAGG CCGTGCGGAT AACATTAAAG TCACGCGCCC GGAAGATTTG GCACTGGCCG 
AGTTTTACCT CACCCGAACC ATCCATCAGG AGAATACATA ATGCGAATTG GACACGGTTT 
TGACGTACAT GCCTTTGGCG GTGAAGGCCC AATTATCATT GGTGGCGTAC GCATTCCTTA 
CGAAAAAGGA TTGCTGGCGC ATTCTGATGG CGACGTGGCG CTCCATGCGT TGACCGATGC 
ATTGCTTGGC GCGGCGGCGC TGGGGGATAT CGGCAAGCTG TTCCCGGATA CCGATCCGGC 
ATTTAAAGGT GGCGATAGCC GCGAGCTGCT ACGCGAAGCC TGGCGTCGTA TTCAGGCGAA 
GGGTTATACC CTTGGCAACG TCGATGTCAC TATCATCGCT CAGGCACCGA AGATGTTGCC 
GCACATTCCA CAAATGCGCG TGTTTATTGC CGAAGATCTC GGCTGCCATA TGGATGATGT 
TAACGTGAAA GCCACTACTA CGGAAAAACT GGGATTTACC GGACGTGGGG AAGGGATTGC 
CTGTGAAGCG GTGGCGCTAC TCATTAAGGC AACAAAATGA CTCGAGGAGG AGAAATTAAC 
CATGCGGACA CAGTGGCCCT CTCCGGCAAA ACTTAATCTG TTTTTATACA TTACCGGTCA 
GCGTGCGGAT GGTTACCACA CGCTGCAAAC GCTGTTTCAG TTTCTTGATT ACGGCGACAC 
CATCAGCATT GAGCTTCGTG ACGATGGGGA TATTCGTCTG TTAACGCCCG TTGAAGGCGT
GGAACATGAA GATAACCTGA TCGTTCGCGC AGCGCGATTG TTGATGAAAA CTGCGGCAGA
CAGCGGGCGT CTTCCGACGG GAAGCGGTGC GAATATCAGC ATTGACAAGC GTTTGCCGAT
GGGCGGCGGT CTCGGCGGTG GTTCATCCAA TGCCGCGACG GTCCTGGTGG CATTAAATCA
TCTCTGGCAA TGCGGGCTAA GCATGGATGA GCTGGCGGAA ATGGGGCTGA CGCTGGGCGC
AGATGTTCCT GTCTTTGTTC GGGGGCATGC CGCGTTTGCC GAAGGCGTTG GTGAAATACT
AACGCCGGTG GATCCGCCAG AGAAGTGGTA TCTGGTGGCG CACCCTGGTG TAAGTATTCC
GACTCCGGTG ATTTTTAAAG ATCCTGAACT CCCGCGCAAT ACGCCAAAAA GGTCAATAGA
AACGTTGCTA AAATGTGAAT TCAGCAATGA TTGCGAGGTT ATCGCAAGAA AACGTTTTCG
CGAGGTTGAT GCGGTGCTTT CCTGGCTGTT AGAATACGCC CCGTCGCGCC TGACTGGGAC
AGGGGCCTGT GTCTTTGCTG AATTTGATAC AGAGTCTGAA GCCCGCCAGG TGCTAGAGCA
AGCCCCGGAA TGGCTCAATG GCTTTGTGGC GAAAGGCGCT AATCTTTCCC CATTGCACAG
AGCCATGCTT TAAGGTACCC AATTCGCCCT ATAGTGAGTC GTATTACGCG CGCTCACTGG
CCGTCGTTTT ACAACGTCGT GACTGGGAAA ACCCTGGCGT TACCCAACTT AATCGCCTTG
CAGCACATCC CCCTTTCGCC AGCTGGCGTA ATAGCGAAGA GGCCCGCACC GATCGCCCTT
CCCAACAGTT GCGCAGCCTG AATGGCGAAT GGAAATTGTA AGCGTTAATA TTTTGTTAAA
ATTCGCGTTA AATTTTTGTT AAATCAGCTC ATTTTTTAAC CAATAGGCCG AAATCGGCAA
AATCCCTTAT AAATCAAAAG AATAGACCGA GATAGGGTTG AGTGTTGTTC CAGTTTGGAA
CAAGAGTCCA CTATTAAAGA ACGTGGACTC CAACGTCAAA GGGCGAAAAA CCGTCTATCA
GGGCGATGGC CCACTACGTG AACCATCACC CTAATCAAGT TTTTTGGGGT CGAGGTGCCG
TAAAGCACTA AATCGGAACC CTAAAGGGAG CCCCCGATTT AGAGCTTGAC GGGGAAAGCC
GGCGAACGTG GCGAGAAAGG AAGGGAAGAA AGCGAAAGGA GCGGGCGCTA GGGCGCTGGC
AAGTGTAGCG GTCACGCTGC GCGTAACCAC CACACCCGCC GCGCTTAATG CGCCGCTACA
GGGCGCGTCA G

Annex D

DNA sequence of the vector construct pACYCgcpE 

ID PACYCGCPE PRELIMINARY; DNA; 5109 BP.

SEQUENCE 5109 BP; 1194 A; 1365 C; 1324 G; 1226 T; 0 OTHER;

GAATTCCGGA TGAGCATTCA TCAGGCGGGC AAGAATGTGA ATAAAGGCCG GATAAAACTT
GTGCTTATTT TTCTTTACGG TCTTTAAAAA GGCCGTAATA TCCAGCTGAA CGGTCTGGTT
ATAGGTACAT TGAGCAACTG ACTGAAATGC CTCAAAATGT TCTTTACGAT GCCATTGGGA
TATATCAACG GTGGTATATC CAGTGATTTT TTTCTCCATT TTAGCTTCCT TAGCTCCTGA
AAATCTCGAT AACTCAAAAA ATACGCCCGG TAGTGATCTT ATTTCATTAT GGTGAAAGTT
GGAACCTCTT ACGTGCCGAT CAACGTCTCA TTTTCGCCAA AAGTTGGCCC AGGGCTTCCC
GGTATCAACA GGGACACCAG GATTTATTTA TTCTGCGAAG TGATCTTCCG TCACAGGTAT
TTATTCGGCG CAAAGTGCGT CGGGTGATGC TGCCAACTTA CTGATTTAGT GTATGATGGT
GTTTTTGAGG TGCTCCAGTG GCTTCTGTTT CTATCAGCTG TCCCTCCTGT TCAGCTACTG
ACGGGGTGGT GCGTAACGGC AAAAGCACCG CCGGACATCA GCGCTAGCGG AGTGTATACT
GGCTTACTAT GTTGGCACTG ATGAGGGTGT CAGTGAAGTG CTTCATGTGG CAGGAGAAAA
AAGGCTGCAC CGGTGCGTCA GCAGAATATG TGATACAGGA TATATTCCGC TTCCTCGCTC
ACTGACTCGC TACGCTCGGT CGTTCGACTG CGGCGAGCGG AAATGGCTTA CGAACGGGGC
GGAGATTTCC TGGAAGATGC CAGGAAGATA CTTAACAGGG AAGTGAGAGG GCCGCGGCAA
AGCCGTTTTT CCATAGGCTC CGCCCCCCTG ACAAGCATCA CGAAATCTGA CGCTCAAATC
AGTGGTGGCG AAACCCGACA GGACTATAAA GATACCAGGC GTTTCCCCCT GGCGGCTCCC
TCGTGCGCTC TCCTGTTCCT GCCTTTCGGT TTACCGGTGT CATTCCGCTG TTATGGCCGC
GTTTGTCTCA TTCCACGCCT GACACTCAGT TCCGGGTAGG CAGTTCGCTC CAAGCTGGAC
TGTATGCACG AACCCCCCGT TCAGTCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT
GAGTCCAACC CGGAAAGACA TGCAAAAGCA CCACTGGCAG CAGCCACTGG TAATTGATTT
AGAGGAGTTA GTCTTGAAGT CATGCGCCGG TTAAGGCTAA ACTGAAAGGA CAAGTTTTGG
TGACTGCGCT CCTCCAAGCC AGTTACCTCG GTTCAAAGAG TTGGTAGCTC AGAGAACCTT
CGAAAAACCG CCCTGCAAGG CGGTTTTTTC GTTTTCAGAG CAAGAGATTA CGCGCAGACC
AAAACGATCT CAAGAAGATC ATCTTATTAA TCAGATAAAA TATTTCTAGA TTTCAGTGCA
ATTTATCTCT TCAAATGTAG CACCTGAAGT CAGCCCCATA CGATATAAGT TGTAATTCTC
ATGTTTGACA GCTTATCATC GATAAGCTTT AATGCGGTAG TTTATCACAG TTAAATTGCT
AACGCAGTCA GGCACCGTGT ATGAAATCTA ACAATGCGCT CATCGTCATC CTCGGCACCG
TCACCCTGGA TGCTGTAGGC ATAGGCTTGG TTATGCCGGT ACTGCCGGGC CTCTTGCGGG
ATATCGTCCA TTCCGACAGC ATCGCCAGTC ACTATGGCGT GCTGCTAGCG CTATATGCGT
TGATGCAATT TCTATGCGCA CCCGTTCTCG GAGCACTGTC CGACCGCTTT GGCCGCCGCC
CAGTCCTGCT CGCTTCGCTA CTTGGAGCCA CTATCGACTA CGCGATCATG GCGACCACAC
CCGTCCTGTG GATCCGAGGA GAAATTAACC ATGCATAACC AGGCTCCAAT TCAACGTAGA
AAATCAACAC GTATTTACGT TGGGAATGTG CCGATTGGCG ATGGTGCTCC CATCGCCGTA
CAGTCCATGA CCAATACGCG TACGACAGAC GTCGAAGCAA CGGTCAATCA AATCAAGGCG
CTGGAACGCG TTGGCGCTGA TATCGTCCGT GTATCCGTAC CGACGATGGA CGCGGCAGAA
GCGTTCAAAC TCATCAAACA GCAGGTTAAC GTGCCGCTGG TGGCTGACAT CCACTTCGAC
TATCGCATTG CGCTGAAAGT AGCGGAATAC GGCGTCGATT GTCTGCGTAT TAACCCTGGC
AATATCGGTA ATGAAGAGCG TATTCGCATG GTGGTTGACT GTGCGCGCGA TAAAAACATT
CCGATCCGTA TTGGCGTTAA CGCCGGATCG CTGGAAAAAG ATCTGCAAGA AAAGTATGGC
GAACCGACGC CGCAGGCGTT GCTGGAATCT GCCATGCGTC ATGTTGATCA TCTCGATCGC
CTGAACTTCG ATCAGTTCAA AGTCAGCGTG AAAGCGTCTG ACGTCTTCCT CGCTGTTGAG
TCTTATCGTT TGCTGGCAAA ACAGATCGAT CAGCCGTTGC ATCTGGGGAT CACCGAAGCC
GGTGGTGCGC GCAGCGGGGC AGTAAAATCC GCCATTGGTT TAGGTCTGCT GCTGTCTGAA
GGCATCGGCG ACACGCTGCG CGTATCGCTG GCGGCCGATC CGGTCGAAGA GATCAAAGTC
GGTTTCGATA TTTTGAAATC GCTGCGTATC CGTTCGCGAG GGATCAACTT CATCGCCTGC
CCGACCTGTT CGCGTCAGGA ATTTGATGTT ATCGGTACGG TTAACGCGCT GGAGCAACGC
CTGGAAGATA TCATCACTCC GATGGACGTT TCGATTATCG GCTGCGTGGT GAATGGCCCA
GGTGAGGCGC TGGTTTCTAC ACTCGGCGTC ACCGGCGGCA ACAAGAAAAG CGGCCTCTAT
GAAGATGGCG TGCGCAAAGA CCGTCTGGAC AACAACGATA TGATCGACCA GCTGGAAGCA
CGCATTCGTG CGAAAGCCAG TCAGCTGGAC GAAGCGCGTC GAATTGACGT TCAGCAGGTT
GAAAAATAAG TCGACCGATG CCCTTGAGAG CCTTCAACCC AGTCAGCTCC TTCCGGTGGG
CGCGGGGCAT GACTATCGTC GCCGCACTTA TGACTGTCTT CTTTATCATG CAACTCGTAG
GACAGGTGCC GGCAGCGCTC TGGGTCATTT TCGGCGAGGA CCGCTTTCGC TGGAGCGCGA
CGATGATCGG CCTGTCGCTT GCGGTATTCG GAATCTTGCA CGCCCTCGCT CAAGCCTTCG
TCACTGGTCC CGCCACCAAA CGTTTCGGCG AGAAGCAGGC CATTATCGCC GGCATGGCGG
CCGACGCGCT GGGCTACGTC TTGCTGGCGT TCGCGACGCG AGGCTGGATG GCCTTCCCCA
TTATGATTCT TCTCGCTTCC GGCGGCATCG GGATGCCCGC GTTGCAGGCC ATGCTGTCCA
GGCAGGTAGA TGACGACCAT CAGGGACAGC TTCAAGGATC GCTCGCGGCT CTTACCAGCC
TAACTTCGAT CACTGGACCG CTGATCGTCA CGGCGATTTA TGCCGCCTCG GCGAGCACAT
GGAACGGGTT GGCATGGATT GTAGGCGCCG CCCTATACCT TGTCTGCCTC CCCGCGTTGC
GTCGCGGTGC ATGGAGCCGG GCCACCTCGA CCTGAATGGA AGCCGGCGGC ACCTCGCTAA
CGGATTCACC ACTCCAAGAA TTGGAGCCAA TCAATTCTTG CGGAGAACTG TGAATGCGCA
AACCAACCCT TGGCAGAACA TATCCATCGC GTCCGCCATC TCCAGCAGCC GCACGCGGCG
CATCTCGGGC AGCGTTGGGT CCTGGCCACG GGTGCGCATG ATGGTGCTCC TGTCGTTGAG
GACCCGGCTA GGCTGGCGGG GTTGCCTTAC TGGTTAGCAG AATGAATCAC CGATACGCGA
GCGAACGTGA AGCGACTGCT GCTGCAAAAC GTCTGCGACC TGAGCAACAA CATGAATGGT
CTTCGGTTTC CGTGTTTCGT AAAGTCTGGA AACGCGGAAG TCCCCTACGT GCTGCTGAAG
TTGCCCGCAA CAGAGAGTGG AACCAACCGG TGATACCACG ATACTATGAC TGAGAGTCAA
CGCCATGAGC GGCCTCATTT CTTATTCTGA GTTACAACAG TCCGCACCGC TGTCCGGTAG
CTCCTTCCGG TGGGCGCGGG GCATGACTAT CGTCGCCGCA CTTATGACTG TCTTCTTTAT
CATGCAACTC GTAGGACAGG TGCCGGCAGC GCCCAACAGT CCCCCGGCCA CGGGGCCTGC
CACCATACCC ACGCCGAAAC AAGCGCCCTG CACCATTATG TTCCGGATCT GCATCGCAGG
ATGCTGCTGG CTACCCTGTG GAACACCTAC ATCTGTATTA ACGAAGCGCT AACCGTTTTT
ATCAGGCTCT GGGAGGCAGA ATAAATGATC ATATCGTCAA TTATTACCTC CACGGGGAGA
GCCTGAGCAA ACTGGCCTCA GGCATTTGAG AAGCACACGG TCACACTGCT TCCGGTAGTC
AATAAACCGG TAAACCAGCA ATAGACATAA GCGGCTATTT AACGACCCTG CCCTGAACCG
ACGACCGGGT CGAATTTGCT TTCGAATTTC TGCCATTCAT CCGCTTATTA TCACTTATTC
AGGCGTAGCA CCAGGCGTTT AAGGGCACCA ATAACTGCCT TAAAAAAATT ACGCCCCGCC
CTGCCACTCA TCGCAGTACT GTTGTAATTC ATTAAGCATT CTGCCGACAT GGAAGCCATC
ACAGACGGCA TGATGAACCT GAATCGCCAG CGGCATCAGC ACCTTGTCGC CTTGCGTATA
ATATTTGCCC ATGGTGAAAA CGGGGGCGAA GAAGTTGTCC ATATTGGCCA CGTTTAAATC
AAAACTGGTG AAACTCACCC AGGGATTGGC TGAGACGAAA AACATATTCT CAATAAACCC
TTTAGGGAAA TAGGCCAGGT TTTCACCGTA ACACGCCACA TCTTGCGAAT ATATGTGTAG
AAACTGCCGG AAATCGTCGT GGTATTCACT CCAGAGCGAT GAAAACGTTT CAGTTTGCTC
ATGGAAAACG GTGTAACAAG GGTGAACACT ATCCCATATC ACCAGCTCAC CGTCTTTCAT
TGCCATACG

Annex E

DNA sequence of the plasmid pBScaro14 

ID PBSCARO14 PRELIMINARY; DNA; 7494 BP.

SQ SEQUENCE 7494 BP; 1722 A; 1935 C; 2026 G; 1811 T;

GTGGCACTTT TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT 
CAAATATGTA TCCGCTCATG AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA 
GGAAGAGTAT GAGTATTCAA CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT
GCCTTCCTGT TTTTGCTCAC CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT 
TGGGTGCACG AGTGGGTTAC ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT 
TTCGCCCCGA AGAACGTTTT CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG 
TATTATCCCG TATTGACGCC GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA 
ATGACTTGGT TGAGTACTCA CCAGTCACAG AAAAGCATCT TACGGATGGC ATGACAGTAA 
GAGAATTATG CAGTGCTGCC ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA 
CAACGATCGG AGGACCGAAG GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA 
CTCGCCTTGA TCGTTGGGAA CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GAGCGTGACA 
CCACGATGCC TGTAGCAATG GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA 
CTCTAGCTTC CCGGCAACAA TTAATAGACT GGATGGAGGC GGATAAAGTT GCAGGACCAC 
TTCTGCGCTC GGCCCTTCCG GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC 
GTGGGTCTCG CGGTATCATT GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG 
TTATCTACAC GACGGGGAGT CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA
TAGGTGCCTC ACTGATTAAG CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT 
AGATTGATTT AAAACTTCAT TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA 
ATCTCATGAC CAAAATCCCT TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGTAG 
AAAAGATCAA AGGATCTTCT TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA 
CAAAAAAACC ACCGCTACCA GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA CCAACTCTTT
TTCCGAAGGT AACTGGCTTC AGCAGAGCGC AGATACCAAA TACTGTCCTT CTAGTGTAGC 
CGTAGTTAGG CCACCACTTC AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA 
TCCTGTTACC AGTGGCTGCT GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA 
GACGATAGTT ACCGGATAAG GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC 
CCAGCTTGGA GCGAACGACC TACACCGAAC TGAGATACCT ACAGCGTGAG CTATGAGAAA 
GCGCCACGCT TCCCGAAGGG AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA 
CAGGAGAGCG CACGAGGGAG CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG 
GGTTTCGCCA CCTCTGACTT GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC 
TATGGAAAAA CGCCAGCAAC GCGGCCTTTT TACGGTTCCT GGCCTTTTGC TGGCCTTTTG 
CTCACATGTT CTTTCCTGCG TTATCCCCTG ATTCTGTGGA TAACCGTATT ACCGCCTTTG 
AGTGAGCTGA TACCGCTCGC CGCAGCCGAA CGACCGAGCG CAGCGAGTCA GTGAGCGAGG 
AAGCGGAAGA GCGCCCAATA CGCAAACCGC CTCTCCCCGC GCGTTGGCCG ATTCATTAAT
GCAGCTGGCA CGACAGGTTT CCCGACTGGA AAGCGGGCAG TGAGCGCAAC GCAATTAATG 
TGAGTTAGCT CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT 
TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC CATGATTACG 
CCAAGCGCGC AATTAACCCT CACTAAAGGG AACAAAAGCT GGAGCTCCAC CGCGGTGGCG 
GCCGCTCTAG AACTAGTGGA TCCCCCGGGC TGCAGGAATT GCCGTAAATG TATCCGTTTA 
TAAGGACAGC CCGAATGACG GTCTGCGCAA AAAAACACGT TCATCTCACT CGCGATGCTG 
CGGAGCAGTT ACTGGCTGAT ATTGATCGAC GCCTTGATCA GTTATTGCCC GTGGAGGGAG 
AACGGGATGT TGTGGGTGCC GCGATGCGTG AAGGTGCGCT GGCACCGGGA AAACGTATTC 
GCCCCATGTT GCTGTTGCTG ACCGCCCGCG ATCTGGGTTG CGCTGTCAGC CATGACGGAT 
TACTGGATTT GGCCTGTGCG GTGGAAATGG TCCACGCGGC TTCGCTGATC CTTGACGATA 
TGCCCTGCAT GGACGATGCG AAGCTGCGGC GCGGACGCCC TACCATTCAT TCTCATTACG 
GAGAGCATGT GGCAATACTG GCGGCGGTTG CCTTGCTGAG TAAAGCCTTT GGCGTAATTG 
CCGATGCAGA TGGCCTCACG CCGCTGGCAA AAAATCGGGC GGTTTCTGAA CTGTCAAACG 
CCATCGGCAT GCAAGGATTG GTTCAGGGTC AGTTCAAGGA TCTGTCTGAA GGGGATAAGC
CGCGCAGCGC TGAAGCTATT TTGATGACGA ATCACTTTAA AACCAGCACG CTGTTTTGTG
CCTCCATGCA GATGGCCTCG ATTGTTGCGA ATGCCTCCAG CGAAGCGCGT GATTGCCTGC
ATCGTTTTTC ACTTGATCTT GGTCAGGCAT TTCAACTGCT GGACGATTTG ACCGATGGCA
TGACCGACAC CGGTAAGGAT AGCAATCAGG ACGCCGGTAA ATCGACGCTG GTCAATCTGT
TAGGCCCGAG GGCGGTTGAA GAACGTCTGA GACAACATCT TCAGCTTGCC AGTGAGCATC
TCTCTGCGGC CTGCCAACAC GGGCACGCCA CTCAACATTT TATTCAGGCC TGGTTTGACA
AAAAACTCGC TGCCGTCAGT TAAGCTTATG TGCACCGGTC AGCCTGTCTT AAGTGGGAGC
GGCTATGCAA CCGCATTATG ATCTGATTCT CGTGGGGGCT GGACTCGCGA ATGGCCTTAT
CGCCCTGCGT CTTCAGCAGC AGCAACCTGA TATGCGTATT TTGCTTATCG ACGCCGCACC
CCAGGCGGGC GGGAATCATA CGTGGTCATT TCACCACGAT GATTTGACTG AGAGCCAACA
TCGTTGGATA GCTCCGCTGG TGGTTCATCA CTGGCCCGAC TATCAGGTAC GCTTTCCCAC
ACGCCGTCGT AAGCTGAACA GCGGCTACTT TTGTATTACT TCTCAGCGTT TCGCTGAGGT
TTTACAGCGA CAGTTTGGCC CGCACTTGTG GATGGATACC GCGGTCGCAG AGGTTAATGC
GGAATCTGTT CGGTTGAAAA AGGGTCAGGT TATCGGTGCC CGCGCGGTGA TTGACGGGCG
GGGTTATGCG GCAAATTCAG CACTGAGCGT GGGCTTCCAG GCGTTTATTG GCCAGGAATG
GCGATTGAGC CACCCGCATG GTTTATCGTC TCCCATTATC ATGGATGCCA CGGTCGATCA
GCAAAATGGT TATCGCTTCG TGTACAGCCT GCCGCTCTCG CCGACCAGAT TGTTAATTGA
AGACACGCAC TATATTGATA ATGCGACATT AGATCCTGAA TGCGCGCGGC AAAATATTTG
CGACTATGCC GCGCAACAGG GTTGGCAGCT TCAGACACTG CTGCGAGAAG AACAGGGCGC
CTTACCCATT ACTCTGTCGG GCAATGCCGA CGCATTCTGG CAGCAGCGCC CCCTGGCCTG
TAGTGGATTA CGTGCCGGTC TGTTCCATCC TACCACCGGC TATTCACTGC CGCTGGCGGT
TGCCGTGGCC GACCGCCTGA GTGCACTTGA TGTCTTTACG TCGGCCTCAA TTCACCATGC
CATTACGCAT TTTGCCCGCG AGCGCTGGCA GCAGCAGGGC TTTTTCCGCA TGCTGAATCG
CATGCTGTTT TTAGCCGGAC CCGCCGATTC ACGCTGGCGG GTTATGCAGC GTTTTTATGG
TTTACCTGAA GATTTAATTG CCCGTTTTTA TGCGGGAAAA CTCACGCTGA CCGATCGGCT
ACGTATTCTG AGCGGCAAGC CGCCTGTTCC GGTATTAGCA GCATTGCAAG CCATTATGAC
GACTCATCGT TAAAGAGCGA CTACATGAAA CCAACTACGG TAATTGGTGC AGGCTTCGGT
GGCCTGGCAC TGGCAATTCG TCTACAAGCT GCGGGGATCC CCGTCTTACT GCTTGAACAA
CGTGATAAAC CCGGCGGTCG GGCTTATGTC TACGAGGATC AGGGGTTTAC CTTTGATGCA
GGCCCGACGG TTATCACCGA TCCCAGTGCC ATTGAAGAAC TGTTTGCACT GGCAGGAAAA
CAGTTAAAAG AGTATGTCGA ACTGCTGCCG GTTACGCCGT TTTACCGCCT GTGTTGGGAG
TCAGGGAAGG TCTTTAATTA CGATAACGAT CAAACCCGGC TCGAAGCGCA GATTCAGCAG
TTTAATCCCC GCGATGTCGA AGGTTATCGT CAGTTTCTGG ACTATTCACG CGCGGTGTTT
AAAGAAGGCT ATCTAAAGCT CGGTACTGTC CCTTTTTTAT CGTTCAGAGA CATGCTTCGC
GCCGCACCTC AACTGGCGAA ACTGCAGGCA TGGAGAAGCG TTTACAGTAA GGTTGCCAGT
TACATCGAAG ATGAACATCT GCGCCAGGCG TTTTCTTTCC ACTCGCTGTT GGTGGGCGGC
AATCCCTTCG CCACCTCATC CATTTATACG TTGATACACG CGCTGGAGCG TGAGTGGGGC
GTCTGGTTTC CGCGTGGCGG CACCGGCGCA TTAGTTCAGG GGATGATAAA GCTGTTTCAG
GATCTGGGTG GCGAAGTCGT GTTAAACGCC AGAGTCAGCC ATATGGAAAC GACAGGAAAC
AAGATTGAAG CCGTGCATTT AGAGGACGGT CGCAGGTTCC TGACGCAAGC CGTCGCGTCA
AATGCAGATG TGGTTCATAC CTATCGCGAC CTGTTAAGCC AGCACCCTGC CGCGGTTAAG
CAGTCCAACA AACTGCAGAC TAAGCGCATG AGTAACTCTC TGTTTGTGCT CTATTTTGGT
TTGAATCACC ATCATGATCA GCTCGCGCAT CACACGGTTT GTTTCGGCCC GCGTTACCGC
GAGCTGATTG ACGAAATTTT TAATCATGAT GGCCTCGCAG AGGACTTCTC ACTTTATCTG
CACGCGCCCT GTGTCACGGA TTCGTCACTG GCGCCTGAAG GTTGCGGCAG TTACTATGTG
TTGGCGCCGG TGCCGCATTT AGGCACCGCG AACCTCGACT GGACGGTTGA GGGGCCAAAA
CTACGCGACC GTATTTTTGC GTACCTTGAG CAGCATTACA TGCCTGGCTT ACGGAGTCAG
CTGGTCACGC ACCGGATGTT TACGCCGTTT GATTTTCGCG ACCAGCTTAA TGCCTATCAT
GGCTCAGCCT TTTCTGTGGA GCCCGTTCTT ACCCAGAGCG CCTGGTTTCG GCCGCATAAC
CGCGATAAAA CCATTACTAA TCTCTACCTG GTCGGCGCAG GCACGCATCC CGGCGCAGGC
ATTCCTGGCG TCATCGGCTC GGCAAAAGCG ACAGCAGGTT TGATGCTGGA GGATCTGATT
TGAATAATCC GTCGTTACTC AATCATGCGG TCGAAACGAT GGCAGTTGGC TCGAAAAGTT
TTGCGACAGC CTCAAAGTTA TTTGATGCAA AAACCCGGCG CAGCGTACTG ATGCTCTACG
CCTGGTGCCG CCATTGTGAC GATGTTATTG ACGATCAGAC GCTGGGCTTT CAGGCCCGGC
AGCCTGCCTT ACAAACGCCC GAACAACGTC TGATGCAACT TGAGATGAAA ACGCGCCAGG
CCTATGCAGG ATCGCAGATG CACGAACCGG CGTTTGCGGC TTTTCAGGAA GTGGCTATGG
CTCATGATAT CGCCCCGGCT TACGCGTTTG ATCATCTGGA AGGCTTCGCC ATGGATGTAC
GCGAAGCGCA ATACAGCCAA CTGGATGATA CGCTGCGCTA TTGCTATCAC GTTGCAGGCG
TTGTCGGCTT GATGATGGCG CAAATCATGG GCGTGCGGGA TAACGCCACG CTGGACCGCG
CCTGTGACCT TGGGCTGGCA TTTCAGTTGA CCAATATTGC TCGCGATATT GTGGACGATG
CGCATGCGGG CCGCTGTTAT CTGCCGGCAA GCTGGCTGGA GCATGAAGGT CTGAACAAAG
AGAATTATGC GGCACCTGAA AACCGTCAGG CGCTGAGCCG TATCGCCCGT CGTTTGGTGC
AGGAAGCAGA ACCTTACTAT TTGTCTGCCA CAGCCGGCCT GGCAGGGTTG CCCCTGCGTT
CCGCCTGGGC AATCGCTACG GCGAAGCAGG TTTACCGGAA AATAGGTGTC AAAGTTGAAC
AGGCCGGTCA GCAAGCCTGG GATCAGCGGC AGTCAACGAC CACGCCCGAA AAATTAACGC
TGCTGCTGGC CGCCTCTGGT CAGGCCCTTA CTTCCCGGAT GCGGGCTCAT CCTCCCCGCC
CTGCGCATCT CTGGCAGCGC CCGCTCTAGC GCCATGTCGA CCTCGAGGGG GGGCCCGGTA
CCCAATTCGC CCTATAGTGA GTCGTATTAC GCGCGCTCAC TGGCCGTCGT TTTACAACGT
CGTGACTGGG AAAACCCTGG CGTTACCCAA CTTAATCGCC TTGCAGCACA TCCCCCTTTC
GCCAGCTGGC GTAATAGCGA AGAGGCCCGC ACCGATCGCC CTTCCCAACA GTTGCGCAGC
CTGAATGGCG AATGGAAATT GTAAGCGTTA ATATTTTGTT AAAATTCGCG TTAAATTTTT
GTTAAATCAG CTCATTTTTT AACCAATAGG CCGAAATCGG CAAAATCCCT TATAAATCAA
AAGAATAGAC CGAGATAGGG TTGAGTGTTG TTCCAGTTTG GAACAAGAGT CCACTATTAA
AGAACGTGGA CTCCAACGTC AAAGGGCGAA AAACCGTCTA TCAGGGCGAT GGCCCACTAC
GTGAACCATC ACCCTAATCA AGTTTTTTGG GGTCGAGGTG CCGTAAAGCA CTAAATCGGA
ACCCTAAAGG GAGCCCCCGA TTTAGAGCTT GACGGGGAAA GCCGGCGAAC GTGGCGAGAA
AGGAAGGGAA GAAAGCGAAA GGAGCGGGCG CTAGGGCGCT GGCAAGTGTA GCGGTCACGC
TGCGCGTAAC CACCACACCC GCCGCGCTTA ATGCGCCGCT ACAGGGCGCG TCAG

Annex F

DNA sequence of the vector construct pACYCcaro14 

ID PACYCCARO14 PRELIMINARY; DNA; 8547 BP.

SQ SEQUENCE 8547 BP; 1884 A; 2296 C; 2279 G; 2088 T; 0 OTHER; 
GAATTCCGGA TGAGCATTCA TCAGGCGGGC AAGAATGTGA ATAAAGGCCG GATAAAACTT 
GTGCTTATTT TTCTTTACGG TCTTTAAAAA GGCCGTAATA TCCAGCTGAA CGGTCTGGTT 
ATAGGTACAT TGAGCAACTG ACTGAAATGC CTCAAAATGT TCTTTACGAT GCCATTGGGA 
TATATCAACG GTGGTATATC CAGTGATTTT TTTCTCCATT TTAGCTTCCT TAGCTCCTGA 
AAATCTCGAT AACTCAAAAA ATACGCCCGG TAGTGATCTT ATTTCATTAT GGTGAAAGTT 
GGAACCTCTT ACGTGCCGAT CAACGTCTCA TTTTCGCCAA AAGTTGGCCC AGGGCTTCCC 
GGTATCAACA GGGACACCAG GATTTATTTA TTCTGCGAAG TGATCTTCCG TCACAGGTAT 
TTATTCGGCG CAAAGTGCGT CGGGTGATGC TGCCAACTTA CTGATTTAGT GTATGATGGT 
GTTTTTGAGG TGCTCCAGTG GCTTCTGTTT CTATCAGCTG TCCCTCCTGT TCAGCTACTG 
ACGGGGTGGT GCGTAACGGC AAAAGCACCG CCGGACATCA GCGCTAGCGG AGTGTATACT 
GGCTTACTAT GTTGGCACTG ATGAGGGTGT CAGTGAAGTG CTTCATGTGG CAGGAGAAAA 
AAGGCTGCAC CGGTGCGTCA GCAGAATATG TGATACAGGA TATATTCCGC TTCCTCGCTC 
ACTGACTCGC TACGCTCGGT CGTTCGACTG CGGCGAGCGG AAATGGCTTA CGAACGGGGC 
GGAGATTTCC TGGAAGATGC CAGGAAGATA CTTAACAGGG AAGTGAGAGG GCCGCGGCAA 
AGCCGTTTTT CCATAGGCTC CGCCCCCCTG ACAAGCATCA CGAAATCTGA CGCTCAAATC 
AGTGGTGGCG AAACCCGACA GGACTATAAA GATACCAGGC GTTTCCCCCT GGCGGCTCCC 
TCGTGCGCTC TCCTGTTCCT GCCTTTCGGT TTACCGGTGT CATTCCGCTG TTATGGCCGC 
GTTTGTCTCA TTCCACGCCT GACACTCAGT TCCGGGTAGG CAGTTCGCTC CAAGCTGGAC 
TGTATGCACG AACCCCCCGT TCAGTCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT 
GAGTCCAACC CGGAAAGACA TGCAAAAGCA CCACTGGCAG CAGCCACTGG TAATTGATTT
AGAGGAGTTA GTCTTGAAGT CATGCGCCGG TTAAGGCTAA ACTGAAAGGA CAAGTTTTGG 
TGACTGCGCT CCTCCAAGCC AGTTACCTCG GTTCAAAGAG TTGGTAGCTC AGAGAACCTT 
CGAAAAACCG CCCTGCAAGG CGGTTTTTTC GTTTTCAGAG CAAGAGATTA CGCGCAGACC 
AAAACGATCT CAAGAAGATC ATCTTATTAA TCAGATAAAA TATTTCTAGA TTTCAGTGCA
ATTTATCTCT TCAAATGTAG CACCTGAAGT CAGCCCCATA CGATATAAGT TGTAATTCTC 
ATGTTTGACA GCTTATCATC GATAAGCTTT AATGCGGTAG TTTATCACAG TTAAATTGCT 
AACGCAGTCA GGCACCGTGT ATGAAATCTA ACAATGCGCT CATCGTCATC CTCGGCACCG 
TCAGCCTGGA TGCTGTAGGC ATAGGCTTGG TTATGCCGGT ACTGCCGGGC CTCTTGCGGG 
ATATCGTCCA TTCCGACAGC ATCGCCAGTC ACTATGGCGT GCTGCTAGCG CTATATGCGT 
TGATGCAATT TCTATGCGCA CCCGTTCTCG GAGCACTGTC CGACCGCTTT GGCCGCCGCC 
CAGTCCTGCT CGCTTCGCTA CTTGGAGCCA CTATCGACTA CGCGATCATG GCGACCACAC 
CCGTCCTGTG GATCCCCCGG GCTGCAGGAA TTGCCGTAAA TGTATCCGTT TATAAGGACA 
GCCCGAATGA CGGTCTGCGC AAAAAAACAC GTTCATCTCA CTCGCGATGC TGCGGAGCAG 
TTACTGGCTG ATATTGATCG ACGCCTTGAT CAGTTATTGC CCGTGGAGGG AGAACGGGAT 
GTTGTGGGTG CCGCGATGCG TGAAGGTGCG CTGGCACCGG GAAAACGTAT TCGCCCCATG 
TTGCTGTTGC TGACCGCCCG CGATCTGGGT TGCGCTGTCA GCCATGACGG ATTACTGGAT 
TTGGCCTGTG CGGTGGAAAT GGTCCACGCG GCTTCGCTGA TCCTTGACGA TATGCCCTGC 
ATGGACGATG CGAAGCTGCG GCGCGGACGC CCTACCATTC ATTCTCATTA CGGAGAGCAT 
GTGGCAATAC TGGCGGCGGT TGCCTTGCTG AGTAAAGCCT TTGGCGTAAT TGCCGATGCA 
GATGGCCTCA CGCCGCTGGC AAAAAATCGG GCGGTTTCTG AACTGTCAAA CGCCATCGGC 
ATGCAAGGAT TGGTTCAGGG TCAGTTCAAG GATCTGTCTG AAGGGGATAA GCCGCGCAGC 
GCTGAAGCTA TTTTGATGAC GAATCACTTT AAAACCAGCA CGCTGTTTTG TGCCTCCATG 
CAGATGGCCT CGATTGTTGC GAATGCCTCC AGCGAAGCGC GTGATTGCCT GCATCGTTTT 
TCACTTGATC TTGGTCAGGC ATTTCAACTG CTGGACGATT TGACCGATGG CATGACCGAC 
ACCGGTAAGG ATAGCAATCA GGACGCCGGT AAATCGACGC TGGTCAATCT GTTAGGCCCG 
AGGGCGGTTG AAGAACGTCT GAGACAACAT CTTCAGCTTG CCAGTGAGCA TCTCTCTGCG 
GCCTGCCAAC ACGGGCACGC CACTCAACAT TTTATTCAGG CCTGGTTTGA CAAAAAACTC 
GCTGCCGTCA GTTAAGCTTA TGTGCACCGG TCAGCCTGTC TTAAGTGGGA GCGGCTATGC 
AACCGCATTA TGATCTGATT CTCGTGGGGG CTGGACTCGC GAATGGCCTT ATCGCCCTGC
GTCTTCAGCA GCAGCAACCT GATATGCGTA TTTTGCTTAT CGACGCCGCA CCCCAGGCGG
GCGGGAATCA TACGTGGTCA TTTCACCACG ATGATTTGAC TGAGAGCCAA CATCGTTGGA
TAGCTCCGCT GGTGGTTCAT CACTGGCCCG ACTATCAGGT ACGCTTTCCC ACACGCCGTC
GTAAGCTGAA CAGCGGCTAC TTTTGTATTA CTTCTCAGCG TTTCGCTGAG GTTTTACAGC
GACAGTTTGG CCCGCACTTG TGGATGGATA CCGCGGTCGC AGAGGTTAAT GCGGAATCTG
TTCGGTTGAA AAAGGGTCAG GTTATCGGTG CCCGCGCGGT GATTGACGGG CGGGGTTATG
CGGCAAATTC AGCACTGAGC GTGGGCTTCC AGGCGTTTAT TGGCCAGGAA TGGCGATTGA
GCCACCCGCA TGGTTTATCG TCTCCCATTA TCATGGATGC CACGGTCGAT CAGCAAAATG
GTTATCGCTT CGTGTACAGC CTGCCGCTCT CGCCGACCAG ATTGTTAATT GAAGACACGC
ACTATATTGA TAATGCGACA TTAGATCCTG AATGCGCGCG GCAAAATATT TGCGACTATG
CCGCGCAACA GGGTTGGCAG CTTCAGACAC TGCTGCGAGA AGAACAGGGC GCCTTACCCA
TTACTCTGTC GGGCAATGCC GACGCATTCT GGCAGCAGCG CCCCCTGGCC TGTAGTGGAT
TACGTGCCGG TCTGTTCCAT CCTACCACCG GCTATTCACT GCCGCTGGCG GTTGCCGTGG
CCGACCGCCT GAGTGCACTT GATGTCTTTA CGTCGGCCTC AATTCACCAT GCCATTACGC
ATTTTGCCCG CGAGCGCTGG CAGCAGCAGG GCTTTTTCCG CATGCTGAAT CGCATGCTGT
TTTTAGCCGG ACCCGCCGAT TCACGCTGGC GGGTTATGCA GCGTTTTTAT GGTTTACCTG
AAGATTTAAT TGCCCGTTTT TATGCGGGAA AACTCACGCT GACCGATCGG CTACGTATTC
TGAGCGGCAA GCCGCCTGTT CCGGTATTAG CAGCATTGCA AGCCATTATG ACGACTCATC
GTTAAAGAGC GACTACATGA AACCAACTAC GGTAATTGGT GCAGGCTTCG GTGGCCTGGC
ACTGGCAATT CGTCTACAAG CTGCGGGGAT CCCCGTCTTA CTGCTTGAAC AACGTGATAA
ACCCGGCGGT CGGGCTTATG TCTACGAGGA TCAGGGGTTT ACCTTTGATG CAGGCCCGAC
GGTTATCACC GATCCCAGTG CCATTGAAGA ACTGTTTGCA CTGGCAGGAA AACAGTTAAA
AGAGTATGTC GAACTGCTGC CGGTTACGCC GTTTTACCGC CTGTGTTGGG AGTCAGGGAA
GGTCTTTAAT TACGATAACG ATCAAACCCG GCTCGAAGCG CAGATTCAGC AGTTTAATCC
CCGCGATGTC GAAGGTTATC GTCAGTTTCT GGACTATTCA CGCGCGGTGT TTAAAGAAGG
CTATCTAAAG CTCGGTACTG TCCCTTTTTT ATCGTTCAGA GACATGCTTC GCGCCGCACC
TCAACTGGCG AAACTGCAGG CATGGAGAAG CGTTTACAGT AAGGTTGCCA GTTACATCGA
AGATGAACAT CTGCGCCAGG CGTTTTCTTT CCACTCGCTG TTGGTGGGCG GCAATCCCTT
CGCCACCTCA TCCATTTATA CGTTGATACA CGCGCTGGAG CGTGAGTGGG GCGTCTGGTT
TCCGCGTGGC GGCACCGGCG CATTAGTTCA GGGGATGATA AAGCTGTTTC AGGATCTGGG
TGGCGAAGTC GTGTTAAACG CCAGAGTCAG CCATATGGAA ACGACAGGAA ACAAGATTGA
AGCCGTGCAT TTAGAGGACG GTCGCAGGTT CCTGACGCAA GCCGTCGCGT CAAATGCAGA
TGTGGTTCAT ACCTATCGCG ACCTGTTAAG CCAGCACCCT GCCGCGGTTA AGCAGTCCAA
CAAACTGCAG ACTAAGCGCA TGAGTAACTC TCTGTTTGTG CTCTATTTTG GTTTGAATCA
CCATCATGAT CAGCTCGCGC ATCACACGGT TTGTTTCGGC CCGCGTTACC GCGAGCTGAT
TGACGAAATT TTTAATCATG ATGGCCTCGC AGAGGACTTC TCACTTTATC TGCACGCGCC
CTGTGTCACG GATTCGTCAC TGGCGCCTGA AGGTTGCGGC AGTTACTATG TGTTGGCGCC
GGTGCCGCAT TTAGGCACCG CGAACCTCGA CTGGACGGTT GAGGGGCCAA AACTACGCGA
CCGTATTTTT GCGTACCTTG AGCAGCATTA CATGCCTGGC TTACGGAGTC AGCTGGTCAC
GCACCGGATG TTTACGCCGT TTGATTTTCG CGACCAGCTT AATGCCTATC ATGGCTCAGC
CTTTTCTGTG GAGCCCGTTC TTACCCAGAG CGCCTGGTTT CGGCCGCATA ACCGCGATAA
AACCATTACT AATCTCTACC TGGTCGGCGC AGGCACGCAT CCCGGCGCAG GCATTCCTGG
CGTCATCGGC TCGGCAAAAG CGACAGCAGG TTTGATGCTG GAGGATCTGA TTTGAATAAT
CCGTCGTTAC TCAATCATGC GGTCGAAACG ATGGCAGTTG GCTCGAAAAG TTTTGCGACA
GCCTCAAAGT TATTTGATGC AAAAACCCGG CGCAGCGTAC TGATGCTCTA CGCCTGGTGC
CGCCATTGTG ACGATGTTAT TGACGATCAG ACGCTGGGCT TTCAGGCCCG GCAGCCTGCC
TTACAAACGC CCGAACAACG TCTGATGCAA CTTGAGATGA AAACGCGCCA GGCCTATGCA
GGATCGCAGA TGCACGAACC GGCGTTTGCG GCTTTTCAGG AAGTGGCTAT GGCTCATGAT
ATCGCCCCGG CTTACGCGTT TGATCATCTG GAAGGCTTCG CCATGGATGT ACGCGAAGCG
CAATACAGCC AACTGGATGA TACGCTGCGC TATTGCTATC ACGTTGCAGG CGTTGTCGGC
TTGATGATGG CGCAAATCAT GGGCGTGCGG GATAACGCCA CGCTGGACCG CGCCTGTGAC
CTTGGGCTGG CATTTCAGTT GACCAATATT GCTCGCGATA TTGTGGACGA TGCGCATGCG
GGCCGCTGTT ATCTGCCGGC AAGCTGGCTG GAGCATGAAG GTCTGAACAA AGAGAATTAT
GCGGCACCTG AAAACCGTCA GGCGCTGAGC CGTATCGCCC GTCGTTTGGT GCAGGAAGCA
GAACCTTACT ATTTGTCTGC CACAGCCGGC CTGGCAGGGT TGCCCCTGCG TTCCGCCTGG
GCAATCGCTA CGGCGAAGCA GGTTTACCGG AAAATAGGTG TCAAAGTTGA ACAGGCCGGT
CAGCAAGCCT GGGATCAGCG GCAGTCAACG ACCACGCCCG AAAAATTAAC GCTGCTGCTG
GCCGCCTCTG GTCAGGCCCT TACTTCCCGG ATGCGGGCTC ATCCTCCCCG CCCTGCGCAT
CTCTGGCAGC GCCCGCTCTA GCGCCATGTC GACCGATGCC CTTGAGAGCC TTCAACCCAG
TCAGCTCCTT CCGGTGGGCG CGGGGCATGA CTATCGTCGC CGCACTTATG ACTGTCTTCT
TTATCATGCA ACTCGTAGGA CAGGTGCCGG CAGCGCTCTG GGTCATTTTC GGCGAGGACC
GCTTTCGCTG GAGCGCGACG ATGATCGGCC TGTCGCTTGC GGTATTCGGA ATCTTGCACG
CCCTCGCTCA AGCCTTCGTC ACTGGTCCCG CCACCAAACG TTTCGGCGAG AAGCAGGCCA
TTATCGCCGG CATGGCGGCC GACGCGCTGG GCTACGTCTT GCTGGCGTTC GCGACGCGAG
GCTGGATGGC CTTCCCCATT ATGATTCTTC TCGCTTCCGG CGGCATCGGG ATGCCCGCGT
TGCAGGCCAT GCTGTCCAGG CAGGTAGATG ACGACCATCA GGGACAGCTT CAAGGATCGC
TCGCGGCTCT TACCAGCCTA ACTTCGATCA CTGGACCGCT GATCGTCACG GCGATTTATG
CCGCCTCGGC GAGCACATGG AACGGGTTGG CATGGATTGT AGGCGCCGCC CTATACCTTG
TCTGCCTCCC CGCGTTGCGT CGCGGTGCAT GGAGCCGGGC CACCTCGACC TGAATGGAAG
CCGGCGGCAC CTCGCTAACG GATTCACCAC TCCAAGAATT GGAGCCAATC AATTCTTGCG
GAGAACTGTG AATGCGCAAA CCAACCCTTG GCAGAACATA TCCATCGCGT CCGCCATCTC
CAGCAGCCGC ACGCGGCGCA TCTCGGGCAG CGTTGGGTCC TGGCCACGGG TGCGCATGAT
CGTGCTCCTG TCGTTGAGGA CCCGGCTAGG CTGGCGGGGT TGCCTTACTG GTTAGCAGAA
TGAATCACCG ATACGCGAGC GAACGTGAAG CGACTGCTGC TGCAAAACGT CTGCGACCTG
AGCAACAACA TGAATGGTCT TCGGTTTCCG TGTTTCGTAA AGTCTGGAAA CGCGGAAGTC
CCCTACGTGC TGCTGAAGTT GCCCGCAACA GAGAGTGGAA CCAACCGGTG ATACCACGAT
ACTATGACTG AGAGTCAACG CCATGAGCGG CCTCATTTCT TATTCTGAGT TACAACAGTC
CGCACCGCTG TCCGGTAGCT CCTTCCGGTG GGCGCGGGGC ATGACTATCG TCGCCGCACT
TATGACTGTC TTCTTTATCA TGCAACTCGT AGGACAGGTG CCGGCAGCGC CCAACAGTCC
CCCGGCCACG GGGCCTGCCA CCATACCCAC GCCGAAACAA GCGCCCTGCA CCATTATGTT
CCGGATCTGC ATCGCAGGAT GCTGCTGGCT ACCCTGTGGA ACACCTACAT CTGTATTAAC
GAAGCGCTAA CCGTTTTTAT CAGGCTCTGG GAGGCAGAAT AAATGATCAT ATCGTCAATT
ATTACCTCCA CGGGGAGAGC CTGAGCAAAC TGGCCTCAGG CATTTGAGAA GCACACGGTC
ACACTGCTTC CGGTAGTCAA TAAACCGGTA AACCAGCAAT AGACATAAGC GGCTATTTAA
CGACCCTGCC CTGAACCGAC GACCGGGTCG AATTTGCTTT CGAATTTCTG CCATTCATCC
GCTTATTATC ACTTATTCAG GCGTAGCACC AGGCGTTTAA GGGCACCAAT AACTGCCTTA
AAAAAATTAC GCCCCGCCCT GCCACTCATC GCAGTACTGT TGTAATTCAT TAAGCATTCT
GCCGACATGG AAGCCATCAC AGACGGCATG ATGAACCTGA ATCGCCAGCG GCATCAGCAC
CTTGTCGCCT TGCGTATAAT ATTTGCCCAT GGTGAAAACG GGGGCGAAGA AGTTGTCCAT
ATTGGCCACG TTTAAATCAA AACTGGTGAA ACTCACCCAG GGATTGGCTG AGACGAAAAA
CATATTCTCA ATAAACCCTT TAGGGAAATA GGCCAGGTTT TCACCGTAAC ACGCCACATC
TTGCGAATAT ATGTGTAGAA ACTGCCGGAA ATCGTCGTGG TATTCACTCC AGAGCGATGA
AAACGTTTCA GTTTGCTCAT GGAAAACGGT GTAACAAGGG TGAACACTAT CCCATATCAC
CAGCTCACCG TCTTTCATTG CCATACG

Annex G

DNA and corresponding amino acid sequence of the ispG (formerly gcpE) gene of Escherichia coli

10 20 30 40 50 60 

I I I I I I 

atgcataaccaggctccaattcaacgtagaaaatcaacacgtatttacgttgggaatgtg 
mhnqapiqrrkstriyvgnv

70 80 90 100 110 120 

I I I I I I 

ccgattggcgatggtgctcccatcgccgtacagtccatgaccaatacgcgtacgacagac 
pigdgapiavqsmtntrttd 

130 140 150 160 170 180 

I I I I I I 

gtcgaagcaacggtcaatcaaatcaaggcgctggaacgcgttggcgctgatatcgtccgt 
veatvnqikalervgadivr

190 200 210 220 230 240

I I I I I I

gtatccgtaccgacgatggacgcggcagaagcgttcaaactcatcaaacagcaggttaac 
vsvptmdaaeafklikqqvn 

250 260 270 280 290 300 

I I I I I I

gtgccgctggtggctgacatccacttcgactatcgcattgcgctgaaagtagcggaatac 
vplvadihfdyrialkvaey 

310 320 330 340 350 360 

I I I I I I 

ggcgtcgattgtctgcgtattaaccctggcaatatcggtaatgaagagcgtattcgcatg 
gvdclrinpgnigneerirm 

370 380 390 400 410 420 

I I I I I I 

gtggttgactgtgcgcgcgataaaaacattccgatccgtattggcgttaacgccggatcg 
vvdcardknipirigvnags 

430 440 450 460 470 480 

I I I I I I 

ctggaaaaagatctgcaagaaaagtatggcgaaccgacgccgcaggcgttgctggaatct 
lekdlqekygeptpqalles

490 500 510 520 530 540 

I I I I I I 

gccatgcgtcatgttgatcatctcgatcgcctgaacttcgatcagttcaaagtcagcgtg 
amrhvdhldrlnfdqfkvsv

550 560 570 580 590 600

I I I I I I

AAAGCGTCTGACGTCTTCCTCGCTGTTGAGTCTTATCGTTTGCTGGCAAAACAGATCGAT
KASDVFLAVESYRLLAKQID

610 620 630 640 650 660 

I I I I I I 

CAGCCGTTGCATCTGGGGATCACCGAAGCCGGTGGTGCGCGCAGCGGGGCAGTAAAATCC 
QPLHLGITEAGGARSGAVKS

670 680 690 700 710 720 

I I I I I I 

GCCATTGGTTTAGGTCTGCTGCTGTCTGAAGGCATCGGCGACACGCTGCGCGTATCGCTG 
AIGLGLLLSEGIGDTLRVSL

730 740 750 760 770 780 

I I I I I I 

GCGGCCGATCCGGTCGAAGAGATCAAAGTCGGTTTCGATATTTTGAAATCGCTGCGTATC
AADPVEEIKVGFDILKSLRI

790 800 810 820 830 840 

I I I I I I 

CGTTCGCGAGGGATCAACTTCATCGCCTGCCCGACCTGTTCGCGTCAGGAATTTGATGTT 
RSRGINFIACPTCSRQEFDV 

850 860 870 880 890 900 

I I I I I I 

ATCGGTACGGTTAACGCGCTGGAGCAACGCCTGGAAGATATCATCACTCCGATGGACGTT 
IGTVNALEQRLEDIITPMDV

910 920 930 940 950 960 

I I I I I I 

TCGATTATCGGCTGCGTGGTGAATGGCCCAGGTGAGGCGCTGGTTTCTACACTCGGCGTC 
SIIGCVVNGPGEALVSTLGV

970 980 990 1000 1010 1020 

I I I I I I 

ACCGGCGGCAACAAGAAAAGCGGCCTCTATGAAGATGGCGTGCGCAAAGACCGTCTGGAC
TGGNKKSGLYEDGVRKDRLD 

1030 1040 1050 1060 1070 1080 

I I I I I I

AACAACGATATGATCGACCAGCTGGAAGCACGCATTCGTGCGAAAGCCAGTCAGCTGGAC
NNDMIDQLEARIRAKASQLD

1090 1100 1110 

I I I 

GAAGCGCGTCGAATTGACGTTCAGCAGGTTGAAAAATAA 
EARRIDVQQVEK- 
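
The correspondence between the DNA lines and the amino acid lines of this annex can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative and are not part of the original disclosure. It prints a stop codon as `*`, whereas the annex prints it as `-`.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (assumption: the ispG reading frame uses the standard code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    dna = "".join(dna.split()).upper()  # drop blanks, accept lower case
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# First and last lines of the ispG sequence as printed in this annex.
first_60 = "atgcataaccaggctccaattcaacgtagaaaatcaacacgtatttacgttgggaatgtg"
last_39 = "GAAGCGCGTCGAATTGACGTTCAGCAGGTTGAAAAATAA"

print(translate(first_60))
print(translate(last_39))
```

Applied line by line, the same function can be used to spot character-recognition errors in either the nucleotide or the amino acid rows.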
Annex H

DNA sequence of the vector construct pBScyclogcpE 



PBSCYCLOG PRELIMINARY; DNA; 8823 BP. 

SEQUENCE 8823 BP; 2123 A; 2169 C; 2468 G; 2063 T; 0 OTHER; 

GTGGCACTTT TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT 
CAAATATGTA TCCGCTCATG AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA 
GGAAGAGTAT GAGTATTCAA CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT 
GCCTTCCTGT TTTTGCTCAC CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT 
TGGGTGCACG AGTGGGTTAC ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT 
TTCGCCCCGA AGAACGTTTT CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG 
TATTATCCCG TATTGACGCC GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA 
ATGACTTGGT TGAGTACTCA CCAGTCACAG AAAAGCATCT TACGGATGGC ATGACAGTAA 
GAGAATTATG CAGTGCTGCC ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA 
CAACGATCGG AGGACCGAAG GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA 
CTCGCCTTGA TCGTTGGGAA CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GAGCGTGACA 
CCACGATGCC TGTAGCAATG GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA 
CTCTAGCTTC CCGGCAACAA TTAATAGACT GGATGGAGGC GGATAAAGTT GCAGGACCAC 
TTCTGCGCTC GGCCCTTCCG GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC
GTGGGTCTCG CGGTATCATT GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG 
TTATCTACAC GACGGGGAGT CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA 
TAGGTGCCTC ACTGATTAAG CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT 
AGATTGATTT AAAACTTCAT TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA 
ATCTCATGAC CAAAATCCCT TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGTAG 
AAAAGATCAA AGGATCTTCT TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA
CAAAAAAACC ACCGCTACCA GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA CCAACTCTTT 
TTCCGAAGGT AACTGGCTTC AGCAGAGCGC AGATACCAAA TACTGTCCTT CTAGTGTAGC 
CGTAGTTAGG CCACCACTTC AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA 
TCCTGTTACC AGTGGCTGCT GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA 
GACGATAGTT ACCGGATAAG GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC 
CCAGCTTGGA GCGAACGACC TACACCGAAC TGAGATACCT ACAGCGTGAG CTATGAGAAA 
GCGCCACGCT TCCCGAAGGG AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA 
CAGGAGAGCG CACGAGGGAG CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG 
GGTTTCGCCA CCTCTGACTT GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC 
TATGGAAAAA CGCCAGCAAC GCGGCCTTTT TACGGTTCCT GGCCTTTTGC TGGCCTTTTG 
CTCACATGTT CTTTCCTGCG TTATCCCCTG ATTCTGTGGA TAACCGTATT ACCGCCTTTG
AGTGAGCTGA TACCGCTCGC CGCAGCCGAA CGACCGAGCG CAGCGAGTCA GTGAGCGAGG 
AAGCGGAAGA GCGCCCAATA CGCAAACCGC CTCTCCCCGC GCGTTGGCCG ATTCATTAAT
GCAGCTGGCA CGACAGGTTT CCCGACTGGA AAGCGGGCAG TGAGCGCAAC GCAATTAATG 
TGAGTTAGCT CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT 
TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC CATGATTACG 
CCAAGCGCGC AATTAACCCT CACTAAAGGG AACAAAAGCT GGAGCTCCAC CGCGGGAGGA 
GAAATTAACC ATGCATAACC AGGCTCCAAT TCAACGTAGA AAATCAACAC GTATTTACGT 
TGGGAATGTG CCGATTGGCG ATGGTGCTCC CATCGCCGTA CAGTCCATGA CCAATACGCG 
TACGACAGAC GTCGAAGCAA CGGTCAATCA AATCAAGGCG CTGGAACGCG TTGGCGCTGA 
TATCGTCCGT GTATCCGTAC CGACGATGGA CGCGGCAGAA GCGTTCAAAC TCATCAAACA 
GCAGGTTAAC GTGCCGCTGG TGGCTGACAT CCACTTCGAC TATCGCATTG CGCTGAAAGT 
AGCGGAATAC GGCGTCGATT GTCTGCGTAT TAACCCTGGC AATATCGGTA ATGAAGAGCG 
TATTCGCATG GTGGTTGACT GTGCGCGCGA TAAAAACATT CCGATCCGTA TTGGCGTTAA 
CGCCGGATCG CTGGAAAAAG ATCTGCAAGA AAAGTATGGC GAACCGACGC CGCAGGCGTT 
GCTGGAATCT GCCATGCGTC ATGTTGATCA TCTCGATCGC CTGAACTTCG ATCAGTTCAA
AGTCAGCGTG AAAGCGTCTG ACGTCTTCCT CGCTGTTGAG TCTTATCGTT TGCTGGCAAA
ACAGATCGAT CAGCCGTTGC ATCTGGGGAT CACCGAAGCC GGTGGTGCGC GCAGCGGGGC
AGTAAAATCC GCCATTGGTT TAGGTCTGCT GCTGTCTGAA GGCATCGGCG ACACGCTGCG
CGTATCGCTG GCGGCCGATC CGGTCGAAGA GATCAAAGTC GGTTTCGATA TTTTGAAATC
GCTGCGTATC CGTTCGCGAG GGATCAACTT CATCGCCTGC CCGACCTGTT CGCGTCAGGA
ATTTGATGTT ATCGGTACGG TTAACGCGCT GGAGCAACGC CTGGAAGATA TCATCACTCC
GATGGACGTT TCGATTATCG GCTGCGTGGT GAATGGCCCA GGTGAGGCGC TGGTTTCTAC
ACTCGGCGTC ACCGGCGGCA ACAAGAAAAG CGGCCTCTAT GAAGATGGCG TGCGCAAAGA
CCGTCTGGAC AACAACGATA TGATCGACCA GCTGGAAGCA CGCATTCGTG CGAAAGCCAG
TCAGCTGGAC GAAGCGCGTC GAATTGACGT TCAGCAGGTT GAAAAATAAG CGGCCGCTCT
AGAACTAGTG GATCCCCCGG GCTGCAGGAA TTCGAGGAGA AATTAACCAT GTATATCGGG
ATAGATCTTG GCACCTCGGG CGTAAAAGTT ATTTTGCTCA ACGAGCAGGG TGAGGTGGTT
GCTGCGCAAA CGGAAAAGCT GACCGTTTCG CGCCCGCATC CACTCTGGTC GGAACAAGAC
CCGGAACAGT GGTGGCAGGC AACTGATCGC GCAATGAAAG CTCTGGGCGA TCAGCATTCT
CTGCAGGACG TTAAAGCATT GGGTATTGCC GGCCAGATGC ACGGAGCAAC CTTGCTGGAT
GCTCAGCAAC GGGTGTTACG CCCTGCCATT TTGTGGAACG ACGGGCGCTG TGCGCAAGAG
TGCACTTTGC TGGAAGCGCG AGTTCCGCAA TCGCGGGTGA TTACCGGCAA CCTGATGATG
CCCGGATTTA CTGCGCCTAA ATTGCTATGG GTTCAGCGGC ATGAGCCGGA GATATTCCGT
CAAATCGACA AAGTATTATT ACCGAAAGAT TACTTGCGTC TGCGTATGAC GGGGGAGTTT
GCCAGCGATA TGTCTGACGC AGCTGGCACC ATGTGGCTGG ATGTCGCAAA GCGTGACTGG
AGTGACGTCA TGCTGCAGGC TTGCGACTTA TCTCGTGACC AGATGCCCGC ATTATACGAA
GGCAGCGAAA TTACTGGTGC TTTGTTACCT GAAGTTGCGA AAGCGTGGGG TATGGCGACG
GTGCCAGTTG TCGCAGGCGG TGGCGACAAT GCAGCTGGTG CAGTTGGTGT GGGAATGGTT
GATGCTAATC AGGCAATGTT ATCGCTGGGG ACGTCGGGGG TCTATTTTGC TGTCAGCGAA
GGGTTCTTAA GCAAGCCAGA AAGCGCCGTA CATAGCTTTT GCCATGCGCT ACCGCAACGT
TGGCATTTAA TGTCTGTGAT GCTGAGTGCA GCGTCGTGTC TGGATTGGGC CGCGAAATTA
ACCGGCCTGA GCAATGTCCC AGCTTTAATC GCTGCAGCTC AACAGGCTGA TGAAAGTGCC
GAGCCAGTTT GGTTTCTGCC TTATCTTTCC GGCGAGCGTA CGCCACACAA TAATCCCCAG
GCGAAGGGGG TTTTCTTTGG TTTGACTCAT CAACATGGCC CCAATGAACT GGCGCGAGCA
GTGCTGGAAG GCGTGGGTTA TGCGCTGGCA GATGGCATGG ATGTCGTGCA TGCCTGCGGT
ATTAAACCGC AAAGTGTTAC GTTGATTGGG GGCGGGGCGC GTAGTGAGTA CTGGCGTCAG
ATGCTGGCGG ATATCAGCGG TCAGCAGCTC GATTACCGTA CGGGGGGGGA TGTGGGGCCA
GCACTGGGCG CAGCAAGGCT GGCGCAGATC GCGGCGAATC CAGAGAAATC GCTCATTGAA
TTGTTGCCGC AACTACCGTT AGAACAGTCG CATCTACCAG ATGCGCAGCG TTATGCCGCT
TATCAGCCAC GACGAGAAAC GTTCCGTCGC CTCTATCAGC AACTTCTGCC ATTAATGGCG
TAAAAGCTTG AGGAGAAATT AACCATGAAG CAACTCACCA TTCTGGGCTC GACCGGCTCG
ATTGGTTGCA GCACGCTGGA CGTGGTGCGC CATAATCCCG AACACTTCCG CGTAGTTGCG
CTGGTGGCAG GCAAAAATGT CACTCGCATG GTAGAACAGT GCCTGGAATT CTCTCCCCGC
TATGCCGTAA TGGACGATGA AGCGAGTGCG AAACTTCTTA AAACGATGCT ACAGCAACAG
GGTAGCCGCA CCGAAGTCTT AAGTGGGCAA CAAGCCGCTT GCGATATGGC AGCGCTTGAG
GATGTTGATC AGGTGATGGC AGCCATTGTT GGCGCTGCTG GGCTGTTACC TACGCTTGCT
GCGATCCGCG CGGGTAAAAC CATTTTGCTG GCCAATAAAG AATCACTGGT TACCTGCGGA
CGTCTGTTTA TGGACGCCGT AAAGCAGAGC AAAGCGCAAT TGTTACCGGT CGATAGCGAA
CATAACGCCA TTTTTCAGAG TTTACCGCAA CCTATCCAGC ATAATCTGGG ATACGCTGAC
CTTGAGCAAA ATGGCGTGGT GTCCATTTTA CTTACCGGGT CTGGTGGCCC TTTCCGTGAG
ACGCCATTGC GCGATTTGGC AACAATGACG CCGGATCAAG CCTGCCGTCA TCCGAACTGG
TCGATGGGGC GTAAAATTTC TGTCGATTCG GCTACCATGA TGAACAAAGG TCTGGAATAC
ATTGAAGCGC GTTGGCTGTT TAACGCCAGC GCCAGCCAGA TGGAAGTGCT GATTCACCCG
CAGTCAGTGA TTCACTCAAT GGTGCGCTAT CAGGACGGCA GTGTTCTGGC GCAGCTGGGG
GAACCGGATA TGCGTACGCC AATTGCCCAC ACCATGGCAT GGCCGAATCG CGTGAACTCT
GGCGTGAAGC CGCTCGATTT TTGCAAACTA AGTGCGTTGA CATTTGCCGC ACCGGATTAT
GATCGTTATC CATGCCTGAA ACTGGCGATG GAGGCGTTCG AACAAGGCCA GGCAGCGAGG
ACAGCATTGA ATGCCGCAAA CGAAATCACC GTTGCTGCTT TTCTTGCGCA ACAAATCCGC
TTTACGGATA TCGCTGCGTT GAATTTATCC GTACTGGAAA AAATGGATAT GCGCGAACCA
CAATGTGTGG ACGATGTGTT ATCTGTTGAT GCGAACGCGC GTGAAGTCGC CAGAAAAGAG
GTGATGCGTC TCGCAAGCTG AGTCGACGAG GAGAAATTAA CCATGGCAAC CACTCATTTG
GATGTTTGCG CCGTGGTTCC GGCGGCCGGA TTTGGCCGTC GAATGCAAAC GGAATGTCCT
AAGCAATATC TCTCAATCGG TAATCAAACC ATTCTTGAAC ACTCGGTGCA TGCGCTGCTG
GCGCATCCCC GGGTGAAACG TGTCGTCATT GCCATAAGTC CTGGCGATAG CCGTTTTGCA
CAACTTCCTC TGGCGAATCA TCCGCAAATC ACCGTTGTAG ATGGCGGTGA TGAGCGTGCC
GATTCCGTGC TGGCAGGTCT GAAAGCCGCT GGCGACGCGC AGTGGGTATT GGTGCATGAC
GCCGCTCGTC CTTGTTTGCA TCAGGATGAC CTCGCGCGAT TGTTGGCGTT GAGCGAAACC
AGCCGCACGG GGGGGATCCT CGCCGCACCA GTGCGCGATA CTATGAAACG TGCCGAACCG
GGCAAAAATG CCATTGCTCA TACCGTTGAT CGCAACGGCT TATGGCACGC GCTGACGCCG
CAATTTTTCC CTCGTGAGCT GTTACATGAC TGTCTGACGC GCGCTCTAAA TGAAGGCGCG
ACTATTACCG ACGAAGCCTC GGCGCTGGAA TATTGCGGAT TCCATCCTCA GTTGGTCGAA
GGCCGTGCGG ATAACATTAA AGTCACGCGC CCGGAAGATT TGGCACTGGC CGAGTTTTAC
CTCACCCGAA CCATCCATCA GGAGAATACA TAATGCGAAT TGGACACGGT TTTGACGTAC
ATGCCTTTGG CGGTGAAGGC CCAATTATCA TTGGTGGCGT ACGCATTCCT TACGAAAAAG
GATTGCTGGC GCATTCTGAT GGCGACGTGG CGCTCCATGC GTTGACCGAT GCATTGCTTG
GCGCGGCGGC GCTGGGGGAT ATCGGCAAGC TGTTCCCGGA TACCGATCCG GCATTTAAAG
GTGCCGATAG CCGCGAGCTG CTACGCGAAG CCTGGCGTCG TATTCAGGCG AAGGGTTATA
CCCTTGGCAA CGTCGATGTC ACTATCATCG CTCAGGCACC GAAGATGTTG CCGCACATTC
CACAAATGCG CGTGTTTATT GCCGAAGATC TCGGCTGCCA TATGGATGAT GTTAACGTGA
AAGCCACTAC TACGGAAAAA CTGGGATTTA CCGGACGTGG GGAAGGGATT GCCTGTGAAG
CGGTGGCGCT ACTCATTAAG GCAACAAAAT GACTCGAGGA GGAGAAATTA ACCATGCGGA
CACAGTGGCC CTCTCCGGCA AAACTTAATC TGTTTTTATA CATTACCGGT CAGCGTGCGG
ATGGTTACCA CACGCTGCAA ACGCTGTTTC AGTTTCTTGA TTACGGCGAC ACCATCAGCA
TTGAGCTTCG TGACGATGGG GATATTCGTC TGTTAACGCC CGTTGAAGGC GTGGAACATG
AAGATAACCT GATCGTTCGC GCAGCGCGAT TGTTGATGTA AACTGCGGCA GACAGCGGGC
GTCTTCCGAC GGGAAGCGGT GCGAATATCA GCATTGACAA GCGTTTGCCG ATGGGCGGCG
GTCTCGGCGG TGGTTCATCC AATGCCGCGA CGGTCCTGGT GGCATTAAAT CATCTCTGGC
AATGCGGGCT AAGCATGGAT GAGCTGGCGG AAATGGGGCT GACGCTGGGC GCAGATGTTC
CTGTCTTTGT TCGGGGGCAT GCCGCGTTTG CCGAAGGCGT TGGTGAAATA CTAACGCCGG
TGGATCCGCC AGAGAAGTGG TATCTGGTGG CGCACCCTGG TGTAAGTATT CCGACTCCGG
TGATTTTTAA AGATCCTGAA CTCCCGCGCA ATACGCCAAA AAGGTCAATA GAAACGTTGC
TAAAATGTGA ATTCAGCAAT GATTGCGAGG TTATCGCAAG AAAACGTTTT CGCGAGGTTG
ATGCGGTGCT TTCCTGGCTG TTAGAATACG CCCCGTCGCG CCTGACTGGG ACAGGGGCCT
GTGTCTTTGC TGAATTTGAT ACAGAGTCTG AAGCCCGCCA GGTGCTAGAG CAAGCCCCGG
AATGGCTCAA TGGCTTTGTG GCGAAAGGCG CTAATCTTTC CCCATTGCAC AGAGCCATGC
TTTAAGGTAC CCAATTCGCC CTATAGTGAG TCGTATTACG CGCGCTCACT GGCCGTCGTT
TTACAACGTC GTGACTGGGA AAACCCTGGC GTTACCCAAC TTAATCGCCT TGCAGCACAT
CCCCCTTTCG CCAGCTGGCG TAATAGCGAA GAGGCCCGCA CCGATCGCCC TTCCCAACAG
TTGCGCAGCC TGAATGGCGA ATGGAAATTG TAAGCGTTAA TATTTTGTTA AAATTCGCGT
TAAATTTTTG TTAAATCAGC TCATTTTTTA ACCAATAGGC CGAAATCGGC AAAATCCCTT
ATAAATCAAA AGAATAGACC GAGATAGGGT TGAGTGTTGT TCCAGTTTGG AACAAGAGTC
CACTATTAAA GAACGTGGAC TCCAACGTCA AAGGGCGAAA AACCGTCTAT CAGGGCGATG
GCCCACTACG TGAACCATCA CCCTAATCAA GTTTTTTGGG GTCGAGGTGC CGTAAAGCAC
TAAATCGGAA CCCTAAAGGG AGCCCCCGAT TTAGAGCTTG ACGGGGAAAG CCGGCGAACG
TGGCGAGAAA GGAAGGGAAG AAAGCGAAAG GAGCGGGCGC TAGGGCGCTG GCAAGTGTAG
CGGTCACGCT GCGCGTAACC ACCACACCCG CCGCGCTTAA TGCGCCGCTA CAGGGCGCGT
CAG
Annex I

DNA sequence of the vector construct pACYClytBgcpE

ID PACYCGCLY PRELIMINARY; DNA; 5793 BP.

SQ SEQUENCE 5793 BP; 1375 A; 1506 C; 1543 G; 1369 T; 0 OTHER; 
GAATTCCGGA TGAGCATTCA TCAGGCGGGC AAGAATGTGA ATAAAGGCCG GATAAAACTT 
GTGCTTATTT TTCTTTACGG TCTTTAAAAA GGCCGTAATA TCCAGCTGAA CGGTCTGGTT
ATAGGTACAT TGAGCAACTG ACTGAAATGC CTCAAAATGT TCTTTACGAT GCCATTGGGA 
TATATCAACG GTGGTATATC CAGTGATTTT TTTCTCCATT TTAGCTTCCT TAGCTCCTGA 
AAATCTCGAT AACTCAAAAA ATACGCCCGG TAGTGATCTT ATTTCATTAT GGTGAAAGTT 
GGAACCTCTT ACGTGCCGAT CAACGTCTCA TTTTCGCCAA AAGTTGGCCC AGGGCTTCCC
GGTATCAACA GGGACACCAG GATTTATTTA TTCTGCGAAG TGATCTTCCG TCACAGGTAT 
TTATTCGGCG CAAAGTGCGT CGGGTGATGC TGCCAACTTA CTGATTTAGT GTATGATGGT 
GTTTTTGAGG TGCTCCAGTG GCTTCTGTTT CTATCAGCTG TCCCTCCTGT TCAGCTACTG 
ACGGGGTGGT GCGTAACGGC AAAAGCACCG CCGGACATCA GCGCTAGCGG AGTGTATACT 
GGCTTACTAT GTTGGCACTG ATGAGGGTGT CAGTGAAGTG CTTCATGTGG CAGGAGAAAA 
AAGGCTGCAC CGGTGCGTCA GCAGAATATG TGATACAGGA TATATTCCGC TTCCTCGCTC 
ACTGACTCGC TACGCTCGGT CGTTCGACTG CGGCGAGCGG AAATGGCTTA CGAACGGGGC 
GGAGATTTCC TGGAAGATGC CAGGAAGATA CTTAACAGGG AAGTGAGAGG GCCGCGGCAA 
AGCCGTTTTT CCATAGGCTC CGCCCCCCTG ACAAGCATGA CGAAATCTGA CGCTCAAATC 
AGTGGTGGCG AAACCCGACA GGACTATAAA GATACCAGGC GTTTCCCCCT GGCGGCTCCC 
TCGTGCGCTC TCCTGTTCCT GCCTTTCGGT TTACCGGTGT CATTCCGCTG TTATGGCCGC 
GTTTGTCTCA TTCCACGCCT GACACTCAGT TCCGGGTAGG CAGTTCGCTC CAAGCTGGAC 
TGTATGCACG AACCCCCCGT TCAGTCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT 
GAGTCCAACC CGGAAAGACA TGCAAAAGCA CCACTGGCAG CAGCCACTGG TAATTGATTT 
AGAGGAGTTA GTCTTGAAGT CATGCGCCGG TTAAGGCTAA ACTGAAAGGA CAAGTTTTGG 
TGACTGCGCT CCTCCAAGCC AGTTACCTCG GTTCAAAGAG TTGGTAGCTC AGAGAACCTT 
CGAAAAACCG CCCTGCAAGG CGGTTTTTTC GTTTTCAGAG CAAGAGATTA CGCGCAGACC 
AAAACGATCT CAAGAAGATC ATCTTATTAA TCAGATAAAA TATTTCTAGA TTTCAGTGCA 
ATTTATCTCT TCAAATGTAG CACCTGAAGT CAGCCCCATA CGATATAAGT TGTAATTCTC 
ATGTTTGACA GCTTATCATC GATAAGCTTT AATGCGGTAG TTTATCACAG TTAAATTGCT 
AACGCAGTCA GGCACCGTGT ATGAAATCTA ACAATGCGCT CATCGTCATC CTCGGCACCG 
TCACCCTGGA TGCTGTAGGC ATAGGCTTGG TTATGCCGGT ACTGCCGGGC CTCTTGCGGG 
ATATCGTCCA TTCCGACAGC ATCGCCAGTC ACTATGGCGT GCTGCTAGCG CTATATGCGT 
TGATGCAATT TCTATGCGCA CCCGTTCTCG GAGCACTGTC CGACCGCTTT GGCCGCCGCC 
CAGTCCTGCT CGCTTCGCTA CTTGGAGCCA CTATCGACTA CGCGATCATG GCGACCACAC 
CCGTCCTGTG GATCCGAGGA GAAATTAACC ATGCATAACC AGGCTCCAAT TCAACGTAGA 
AAATCAACAC GTATTTACGT TGGGAATGTG CCGATTGGCG ATGGTGCTCC CATCGCCGTA 
CAGTCCATGA CCAATACGCG TACGACAGAC GTCGAAGCAA CGGTCAATCA AATCAAGGCG 
CTGGAACGCG TTGGCGCTGA TATCGTCCGT GTATCCGTAC CGACGATGGA CGCGGCAGAA 
GCGTTCAAAC TCATCAAACA GCAGGTTAAC GTGCCGCTGG TGGCTGACAT CCACTTCGAC 
TATCGCATTG CGCTGAAAGT AGCGGAATAC GGCGTCGATT GTCTGCGTAT TAACCCTGGC 
AATATCGGTA ATGAAGAGCG TATTCGCATG GTGGTTGACT GTGCGCGCGA TAAAAACATT 
CCGATCCGTA TTGGCGTTAA CGCCGGATCG CTGGAAAAAG ATCTGCAAGA AAAGTATGGC 
GAACCGACGC CGCAGGCGTT GCTGGAATCT GCCATGCGTC ATGTTGATCA TCTCGATCGC 
CTGAACTTCG ATCAGTTCAA AGTCAGCGTG AAAGCGTCTG ACGTCTTCCT CGCTGTTGAG 
TCTTATCGTT TGCTGGCAAA ACAGATCGAT CAGCCGTTGC ATCTGGGGAT CACCGAAGCC 
GGTGGTGCGC GCAGCGGGGC AGTAAAATCC GCCATTGGTT TAGGTCTGCT GCTGTCTGAA 
GGCATCGGCG ACACGCTGCG CGTATCGCTG GCGGCCGATC CGGTCGAAGA GATCAAAGTC 
GGTTTCGATA TTTTGAAATC GCTGCGTATC CGTTCGCGAG GGATCAACTT CATCGCCTGC 
CCGACCTGTT CGCGTCAGGA ATTTGATGTT ATCGGTACGG TTAACGCGCT GGAGCAACGC 
CTGGAAGATA TCATCACTCC GATGGACGTT TCGATTATCG GCTGCGTGGT GAATGGCCCA 
GGTGAGGCGC TGGTTTCTAC ACTCGGCGTC ACCGGCGGCA ACAAGAAAAG CGGCCTCTAT
GAAGATGGCG TGCGCAAAGA CCGTCTGGAC AACAACGATA TGATCGACCA GCTGGAAGCA
CGCATTCGTG CGAAAGCCAG TCAGCTGGAC GAAGCGCGTC GAATTGACGT TCAGCAGGTT
GAAAAATAAG TCGACGAGGA GAAATTAACC ATGCAGATCC TGTTGGCCAA CCCGCGTGGT
TTTTGTGCCG GGGTAGACCG CGCTATCAGC ATTGTTGAAA ACGCGCTGGC CATTTACGGC
GCACCGATAT ATGTCCGTCA CGAAGTGGTA CATAACCGCT ATGTGGTCGA TAGCTTGCGT
GAGCGTGGGG CTATCTTTAT TGAGCAGATT AGCGAAGTAC CGGACGGCGC GATCCTGATT
TTCTCCGCAC ACGGTGTTTC TCAGGCGGTA CGTAACGAAG CAAAAAGTCG CGATTTGACG
GTGTTTGATG CCACCTGTCC GCTGGTGACC AAAGTGCATA TGGAAGTCGC CCGCGCCAGT
CGCCGTGGCG AAGAATCTAT TCTCATCGGT CACGCCGGGC ACCCGGAAGT GGAAGGGACA
ATGGGCCAGT ACAGTAACCC GGAAGGGGGA ATGTATCTGG TCGAATCGCC GGACGATGTG
TGGAAACTGA CGGTCAAAAA CGAAGAGAAG CTCTCCTTTA TGACCCAGAC CACGCTGTCG
GTGGATGACA CGTCTGATGT GATCGACGCG CTGCGTAAAC GCTTCCCGAA AATTGTCGGT
CCGCGCAAAG ATGACATCTG CTACGCCACG ACTAACCGTC AGGAAGCGGT ACGCGCCCTG
GCAGAACAGG CGGAAGTTGT GTTGGTGGTC GGTTCGAAAA ACTCCTCCAA CTCCAACCGT
CTGGCGGAGC TGGCCCAGCG TATGGGCAAA CGCGCGTTTT TGATTGACGA TGCGAAAGAC
ATCCAGGAAG AGTGGGTGAA AGAGGTTAAA TGCGTCGGCG TGACTGCGGG CGCATCGGCT
CCGGATATTC TGGTGCAGAA TGTGGTGGCA CGTTTGCAGC AGCTGGGCGG TGGTGAAGCC
ATTCCGCTGG AAGGCCGTGA AGAAAACATT GTTTTCGAAG TGCCGAAAGA GCTGCGTGTC
GATATTCGTG AAGTCGATTA ACGGCCGACG CGCTGGGCTA CGTCTTGCTG GCGTTCGCGA
CGCGAGGCTG GATGGCCTTC CCCATTATGA TTCTTCTCGC TTCCGGCGGC ATCGGGATGC
CCGCGTTGCA GGCCATGCTG TCCAGGCAGG TAGATGACGA CCATCAGGGA CAGCTTCAAG
GATCGCTCGC GGCTCTTACC AGCCTAACTT CGATCACTGG ACCGCTGATC GTCACGGCGA
TTTATGCCGC CTCGGCGAGC ACATGGAACG GGTTGGCATG GATTGTAGGC GCCGCCCTAT
ACCTTGTCTG CCTCCCCGCG TTGCGTCGCG GTGCATGGAG CCGGGCCACC TCGACCTGAA
TGGAAGCCGG CGGCACCTCG CTAACGGATT CACCACTCCA AGAATTGGAG CCAATCAATT
CTTGCGGAGA ACTGTGAATG CGCAAACCAA CCCTTGGCAG AACATATCCA TCGCGTCCGC
CATCTCCAGC AGCCGCACGC GGCGCATCTC GGGCAGCGTT GGGTCCTGGC CACGGGTGCG
CATGATCGTG CTCCTGTCGT TGAGGACCCG GCTAGGCTGG CGGGGTTGCC TTACTGGTTA
GCAGAATGAA TCACCGATAC GCGAGCGAAC GTGAAGCGAC TGCTGCTGCA AAACGTCTGC
GACCTGAGCA ACAACATGAA TGGTCTTCGG TTTCCGTGTT TCGTAAAGTC TGGAAACGCG
GAAGTCCCCT ACGTGCTGCT GAAGTTGCCC GCAACAGAGA GTGGAACCAA CCGGTGATAC
CACGATACTA TGACTGAGAG TCAACGCCAT GAGCGGCCTC ATTTCTTATT CTGAGTTACA
ACAGTCCGCA CCGCTGTCCG GTAGCTCCTT CCGGTGGGCG CGGGGCATGA CTATCGTCGC
CGCACTTATG ACTGTCTTCT TTATCATGCA ACTCGTAGGA CAGGTGCCGG CAGCGCCCAA
CAGTCCCCCG GCCACGGGGC CTGCCACCAT ACCCACGCCG AAACAAGCGC CCTGCACCAT
TATGTTCCGG ATCTGCATCG CAGGATGCTG CTGGCTACCC TGTGGAACAC CTACATCTGT
ATTAACGAAG CGCTAACCGT TTTTATCAGG CTCTGGGAGG CAGAATAAAT GATCATATCG
TCAATTATTA CCTCCACGGG GAGAGCCTGA GCAAACTGGC CTCAGGCATT TGAGAAGCAC
ACGGTCACAC TGCTTCCGGT AGTCAATAAA CCGGTAAACC AGCAATAGAC ATAAGCGGCT
ATTTAACGAC CCTGCCCTGA ACCGACGACC GGGTCGAATT TGCTTTCGAA TTTCTGCCAT
TCATCCGCTT ATTATCACTT ATTCAGGCGT AGCACCAGGC GTTTAAGGGC ACCAATAACT
GCCTTAAAAA AATTACGCCC CGCCCTGCCA CTCATCGCAG TACTGTTGTA ATTCATTAAG
CATTCTGCCG ACATGGAAGC CATCACAGAC GGCATGATGA ACCTGAATCG CCAGCGGCAT
CAGCACCTTG TCGCCTTGCG TATAATATTT GCCCATGGTG AAAACGGGGG CGAAGAAGTT
GTCCATATTG GCCACGTTTA AATCAAAACT GGTGAAACTC ACCCAGGGAT TGGCTGAGAC
GAAAAACATA TTCTCAATAA ACCCTTTAGG GAAATAGGCC AGGTTTTCAC CGTAACACGC
CACATCTTGC GAATATATGT GTAGAAACTG CCGGAAATCG TCGTGGTATT CACTCCAGAG
CGATGAAAAC GTTTCAGTTT GCTCATGGAA AACGGTGTAA CAAGGGTGAA CACTATCCCA
TATCACCAGC TCACCGTCTT TCATTGCCAT ACG
Annex J

DNA and corresponding amino acid sequence of the ispH (formerly lytB) gene from

Escherichia coli 

10 20 30 40 50 60 70 

I I I I I I I 

ATGCAGATCCTGTTGGCCAACCCGCGTGGTTTTTGTGCCGGGGTAGACCGCGCTATCAGCATTGTTGAAAAC 
MQILLANPRGFCAGVDRAISIVEN

80 90 100 110 120 130 140 

I I I I I I I 

GCGCTGGCCATTTACGGCGCACCGATATATGTCCGTCACGAAGTGGTACATAACCGCTATGTGGTCGATAGC 
ALAIYGAPIYVRHEVVHNRYVVDS

150 160 170 180 190 200 210 

I I I I I I I 

TTGCGTGAGCGTGGGGCTATCTTTATTGAGCAGATTAGCGAAGTACCGGACGGCGCGATCCTGATTTTCTCC 
LRERGAIFIEQISEVPDGAILIFS 

220 230 240 250 260 270 280 

I I I I I I I

GCACACGGTGTTTCTCAGGCGGTACGTAACGAAGCAAAAAGTCGCGATTTGACGGTGTTTGATGCCACCTGT 
AHGVSQAVRNEAKSRDLTVFDATC 

290 300 310 320 330 340 350 360 

I I I I I I I I 

CCGCTGGTGACCAAAGTGCATATGGAAGTCGCCCGCGCCAGTCGCCGTGGCGAAGAATCTATTCTCATCGGT 
PLVTKVHMEVARASRRGEESILIG

370 380 390 400 410 420 430 

I I I I I I I

CACGCCGGGCACCCGGAAGTGGAAGGGACAATGGGCCAGTACAGTAACCCGGAAGGGGGAATGTATCTGGTC 
HAGHPEVEGTMGQYSNPEGGMYLV 

440 450 460 470 480 490 500 

I I I I I I I 

GAATCGCCGGACGATGTGTGGAAACTGACGGTCAAAAACGAAGAGAAGCTCTCCTTTATGACCCAGACCACG 
ESPDDVWKLTVKNEEKLSFMTQTT 

510 520 530 540 550 560 570 

I I I I I I I 

CTGTCGGTGGATGACACGTCTGATGTGATCGACGCGCTGCGTAAACGCTTCCCGAAAATTGTCGGTCCGCGC 
LSVDDTSDVIDALRKRFPKIVGPR

580 590 600 610 620 630 640 

I I I I I I I

AAAGATGACATCTGCTACGCCACGACTAACCGTCAGGAAGCGGTACGCGCCCTGGCAGAACAGGCGGAAGTT 
KDDICYATTNRQEAVRALAEQAEV

650 660 670 680 690 700 710 720 

I I I I I I I I 

GTGTTGGTGGTCGGTTCGAAAAACTCCTCCAACTCCAACCGTCTGGCGGAGCTGGCCCAGCGTATGGGCAAA 
VLVVGSKNSSNSNRLAELAQRMGK 

730 740 750 760 770 780 790 

I I I I I I I 

CGCGCGTTTTTGATTGACGATGCGAAAGACATCCAGGAAGAGTGGGTGAAAGAGGTTAAATGCGTGGGCGTG 
RAFLIDDAKDIQEEWVKEVKCVGV

800 810 820 830 840 850 860 

I I I I I I I 

ACTGCGGGCGCATCGGCTCCGGATATTCTGGTGCAGAATGTGGTGGCACGTTTGCAGCAGCTGGGCGGTGGT 
TAGASAPDILVQNVVARLQQLGGG

870 880 890 900 910 920 930 
I I I I I I I

GAAGCCATTCCGCTGGAAGGCCGTGAAGAAAACATTGTTTTCGAAGTGCCGAAAGAGCTGCGTGTCGATATT 
EAIPLEGREENIVFEVPKELRVDI

940 950 

I I 

CGTGAAGTCGATTAA 
REVD-
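
The pairing of each nucleotide row with its one-letter amino acid row in this annex can be checked mechanically. The following is a minimal Python sketch, assuming the standard genetic code (NCBI translation table 1) and the annex's convention of printing a stop codon as "-". The names CODON_TABLE and translate are illustrative only and are not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position
# (assumption: NCBI translation table 1 applies to these E. coli sequences).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, stop_symbol="-"):
    """Translate a DNA string in reading frame 1 into one-letter amino acids.

    Whitespace is ignored, unreadable codons become "X", and stop codons are
    written as stop_symbol to match the notation used in the annexes.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# First six codons and the last five codons of the ispH reading frame above.
print(translate("ATGCAGATCCTGTTGGCC"))  # MQILLA
print(translate("CGTGAAGTCGATTAA"))     # REVD-
```

Running the same function over a full nucleotide row and comparing the result with the printed amino acid row is a quick way to locate character-recognition errors in either row.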
Annex K

DNA Sequence of the plasmid construct pBScyclogcpElytB2 



ID PBSXICH2 PRELIMINARY; DNA; 9795 BP. 

SQ SEQUENCE 9795 BP; 2351 A; 2401 C; 2770 G; 2273 T; 0 OTHER; 
GTGGCACTTT TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT 
CAAATATGTA TCCGCTCATG AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA 
GGAAGAGTAT GAGTATTCAA CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT
GCCTTCCTGT TTTTGCTCAC CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT
TGGGTGCACG AGTGGGTTAC ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT 
TTCGCCCCGA AGAACGTTTT CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG 
TATTATCCCG TATTGACGCC GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA 
ATGACTTGGT TGAGTACTCA CCAGTCACAG AAAAGCATCT TACGGATGGC ATGACAGTAA 
GAGAATTATG CAGTGCTGCC ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA 
CAACGATCGG AGGACCGAAG GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA 
CTCGCCTTGA TCGTTGGGAA CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GAGCGTGACA 
CCACGATGCC TGTAGCAATG GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA 
CTCTAGCTTC CCGGCAACAA TTAATAGACT GGATGGAGGC GGATAAAGTT GCAGGACCAC 
TTCTGCGCTC GGCCCTTCCG GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC 
GTGGGTCTCG CGGTATCATT GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG 
TTATCTACAC GACGGGGAGT CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA 
TAGGTGCCTC ACTGATTAAG CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT 
AGATTGATTT AAAACTTCAT TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA 
ATCTCATGAC CAAAATCCCT TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGTAG 
AAAAGATCAA AGGATCTTCT TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA 
CAAAAAAACC ACCGCTACCA GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA CCAACTCTTT 
TTCCGAAGGT AACTGGCTTC AGCAGAGCGC AGATACCAAA TACTGTCCTT CTAGTGTAGC 
CGTAGTTAGG CCACCACTTC AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA 
TCCTGTTACC AGTGGCTGCT GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA 
GACGATAGTT ACCGGATAAG GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC 
CCAGCTTGGA GCGAACGACC TACACCGAAC TGAGATACCT ACAGCGTGAG CTATGAGAAA 
GCGCCACGCT TCCCGAAGGG AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA 
CAGGAGAGCG CACGAGGGAG CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG 
GGTTTCGCCA CCTCTGACTT GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC 
TATGGAAAAA CGCCAGCAAC GCGGCCTTTT TACGGTTCCT GGCCTTTTGC TGGCCTTTTG 
CTCACATGTT CTTTCCTGCG TTATCCCCTG ATTCTGTGGA TAACCGTATT ACCGCCTTTG 
AGTGAGCTGA TACCGCTCGC CGCAGCCGAA CGACCGAGCG CAGCGAGTCA GTGAGCGAGG 
AAGCGGAAGA GCGCCCAATA CGCAAACCGC CTCTCCCCGC GCGTTGGCCG ATTCATTAAT 
GCAGCTGGCA CGACAGGTTT CCCGACTGGA AAGCGGGCAG TGAGCGCAAC GCAATTAATG 
TGAGTTAGCT CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT 
TGTGTGGAAT TGTGAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC CATGATTACG 
CCAAGCGCGC AATTAACCCT CACTAAAGGG AACAAAAGCT GGAGCTCCAC CGCGGGAGGA 
GAAATTAACC ATGCATAACC AGGCTCCAAT TCAACGTAGA AAATCAACAC GTATTTACGT 
TGGGAATGTG CCGATTGGCG ATGGTGCTCC CATCGCCGTA CAGTCCATGA CCAATACGCG 
TACGACAGAC GTCGAAGCAA CGGTCAATCA AATCAAGGCG CTGGAACGCG TTGGCGCTGA 
TATCGTCCGT GTATCCGTAC CGACGATGGA CGCGGCAGAA GCGTTCAAAC TCATCAAACA 
GCAGGTTAAC GTGCCGCTGG TGGCTGACAT CCACTTCGAC TATCGCATTG CGCTGAAAGT 
AGCGGAATAC GGCGTCGATT GTCTGCGTAT TAACCCTGGC AATATCGGTA ATGAAGAGCG 
TATTCGCATG GTGGTTGACT GTGCGCGCGA TAAAAACATT CCGATCCGTA TTGGCGTTAA 
CGCCGGATCG CTGGAAAAAG ATCTGCAAGA AAAGTATGGC GAACCGACGC CGCAGGCGTT
GCTGGAATCT GCCATGCGTC ATGTTGATCA TCTCGATCGC CTGAACTTCG ATCAGTTCAA 
AGTCAGCGTG AAAGCGTCTG ACGTCTTCCT CGCTGTTGAG TCTTATCGTT TGCTGGCAAA 
ACAGATCGAT CAGCCGTTGC ATCTGGGGAT CACCGAAGCC GGTGGTGCGC GCAGCGGGGC
AGTAAAATCC GCCATTGGTT TAGGTCTGCT GCTGTCTGAA GGCATCGGCG ACACGCTGCG
CGTATCGCTG GCGGCCGATC CGGTCGAAGA GATCAAAGTC GGTTTCGATA TTTTGAAATC
GCTGCGTATC CGTTCGCGAG GGATCAACTT CATCGCCTGC CCGACCTGTT CGCGTCAGGA
ATTTGATGTT ATCGGTACGG TTAACGCGCT GGAGCAACGC CTGGAAGATA TCATCACTCC
GATGGACGTT TCGATTATCG GCTGCGTGGT GAATGGCCCA GGTGAGGCGC TGGTTTCTAC
ACTCGGCGTC ACCGGCGGCA ACAAGAAAAG CGGCCTCTAT GAAGATGGCG TGCGCAAAGA
CCGTCTGGAC AACAACGATA TGATCGACCA GCTGGAAGCA CGCATTCGTG CGAAAGCCAG
TCAGCTGGAC GAAGCGCGTC GAATTGACGT TCAGCAGGTT GAAAAATAAG TCGACGAGGA
GAAATTAACC ATGCAGATCC TGTTGGCCAA CCCGCGTGGT TTTTGTGCCG GGGTAGACCG
CGCTATCAGC ATTGTTGAAA ACGCGCTGGC CATTTACGGC GCACCGATAT ATGTCCGTCA
CGAAGTGGTA CATAACCGCT ATGTGGTCGA TAGCTTGCGT GAGCGTGGGG CTATCTTTAT
TGAGCAGATT AGCGAAGTAC CGGACGGCGC GATCCTGATT TTCTCCGCAC ACGGTGTTTC
TCAGGCGGTA CGTAACGAAG CAAAAAGTCG CGATTTGACG GTGTTTGATG CCACCTGTCC
GCTGGTGACC AAAGTGCATA TGGAAGTCGC CCGCGCCAGT CGCCGTGGCG AAGAATCTAT
TCTCATCGGT CACGCCGGGC ACCCGGAAGT GGAAGGGACA ATGGGCCAGT ACAGTAACCC
GGAAGGGGGA ATGTATCTGG TCGAATCGCC GGACGATGTG TGGAAACTGA CGGTCAAAAA
CGAAGAGAAG CTCTCCTTTA TGACCCAGAC CACGCTGTCG GTGGATGACA CGTCTGATGT
GATCGACGCG CTGCGTAAAC GCTTCCCGAA AATTGTCGGT CCGCGCAAAG ATGACATCTG
CTACGCCACG ACTAACCGTC AGGAAGCGGT ACGCGCCCTG GCAGAACAGG CGGAAGTTGT
GTTGGTGGTC GGTTCGAAAA ACTCCTCCAA CTCCAACCGT CTGGCGGAGC TGGCCCAGCG
TATGGGCAAA CGCGCGTTTT TGATTGACGA TGCGAAAGAC ATCCAGGAAG AGTGGGTGAA
AGAGGTTAAA TGCGTCGGCG TGACTGCGGG CGCATCGGCT CCGGATATTC TGGTGCAGAA
TGTGGTGGCA CGTTTGCAGC AGCTGGGCGG TGGTGAAGCC ATTCCGCTGG AAGGCCGTGA
AGAAAACATT GTTTTCGAAG TGCCGAAAGA GCTGCGTGTC GATATTCGTG AAGTCGATTA
AGCGGCCGCT CTAGAACTAG TGGATCCCCC GGGCTGCAGG AATTCGAGGA GAAATTAACC
ATGTATATCG GGATAGATCT TGGCACCTCG GGCGTAAAAG TTATTTTGCT CAACGAGCAG
GGTGAGGTGG TTGCTGCGCA AACGGAAAAG CTGACCGTTT CGCGCCCGCA TCCACTCTGG
TCGGAACAAG ACCCGGAACA GTGGTGGCAG GCAACTGATC GCGCAATGAA AGCTCTGGGC
GATCAGCATT CTCTGCAGGA CGTTAAAGCA TTGGGTATTG CCGGCCAGAT GCACGGAGCA
ACCTTGCTGG ATGCTCAGCA ACGGGTGTTA CGCCCTGCCA TTTTGTGGAA CGACGGGCGC
TGTGCGCAAG AGTGCACTTT GCTGGAAGCG CGAGTTCCGC AATCGCGGGT GATTACCGGC
AACCTGATGA TGCCCGGATT TACTGCGCCT AAATTGCTAT GGGTTCAGCG GCATGAGCCG
GAGATATTCC GTCAAATCGA CAAAGTATTA TTACCGAAAG ATTACTTGCG TCTGCGTATG
ACGGGGGAGT TTGCCAGCGA TATGTCTGAC GCAGCTGGCA CCATGTGGCT GGATGTCGCA
AAGCGTGACT GGAGTGACGT CATGCTGCAG GCTTGCGACT TATCTCGTGA CCAGATGCCC
GCATTATACG AAGGCAGCGA AATTACTGGT GCTTTGTTAC CTGAAGTTGC GAAAGCGTGG
GGTATGGCGA CGGTGCCAGT TGTCGCAGGC GGTGGCGACA ATGCAGCTGG TGCAGTTGGT
GTGGGAATGG TTGATGCTAA TCAGGCAATG TTATCGCTGG GGACGTCGGG GGTCTATTTT
GCTGTCAGCG AAGGGTTCTT AAGCAAGCCA GAAAGCGCCG TACATAGCTT TTGCCATGCG
CTACCGCAAC GTTGGCATTT AATGTCTGTG ATGCTGAGTG CAGCGTCGTG TCTGGATTGG
GCCGCGAAAT TAACCGGCCT GAGCAATGTC CCAGCTTTAA TCGCTGCAGC TCAACAGGCT
GATGAAAGTG CCGAGCCAGT TTGGTTTCTG CCTTATCTTT CCGGCGAGCG TACGCCACAC
AATAATCCCC AGGCGAAGGG GGTTTTCTTT GGTTTGACTC ATCAACATGG CCCCAATGAA
CTGGCGCGAG CAGTGCTGGA AGGCGTGGGT TATGCGCTGG CAGATGGCAT GGATGTCGTG
CATGCCTGCG GTATTAAACC GCAAAGTGTT ACGTTGATTG GGGGCGGGGC GCGTAGTGAG
TACTGGCGTC AGATGCTGGC GGATATCAGC GGTCAGCAGC TCGATTACCG TACGGGGGGG
GATGTGGGGC CAGCACTGGG CGCAGCAAGG CTGGCGCAGA TCGCGGCGAA TCCAGAGAAA
TCGCTCATTG AATTGTTGCC GCAACTACCG TTAGAACAGT CGCATCTACC AGATGCGCAG
CGTTATGCCG CTTATCAGCC ACGACGAGAA ACGTTCCGTC GCCTCTATCA GCAACTTCTG
CCATTAATGG CGTAAAAGCT TGAGGAGAAA TTAACCATGA AGCAACTCAC CATTCTGGGC
TCGACCGGCT CGATTGGTTG CAGCACGCTG GACGTGGTGC GCCATAATCC CGAACACTTC
CGCGTAGTTG CGCTGGTGGC AGGCAAAAAT GTCACTCGCA TGGTAGAACA GTGCCTGGAA
TTCTCTCCCC GCTATGCCGT AATGGACGAT GAAGCGAGTG CGAAACTTCT TAAAACGATG
CTACAGCAAC AGGGTAGCCG CACCGAAGTC TTAAGTGGGC AACAAGCCGC TTGCGATATG
GCAGCGCTTG AGGATGTTGA TCAGGTGATG GCAGCCATTG TTGGCGCTGC TGGGCTGTTA
CCTACGCTTG CTGCGATCCG CGCGGGTAAA ACCATTTTGC TGGCCAATAA AGAATCACTG
GTTACCTGCG GACGTCTGTT TATGGACGCC GTAAAGCAGA GCAAAGCGCA ATTGTTACCG
GTCGATAGCG AACATAACGC CATTTTTCAG AGTTTACCGC AACCTATCCA GCATAATCTG
GGATACGCTG ACCTTGAGCA AAATGGCGTG GTGTCCATTT TACTTACCGG GTCTGGTGGC
CCTTTCCGTG AGACGCCATT GCGCGATTTG GCAACAATGA CGCCGGATCA AGCCTGCCGT
CATCCGAACT GGTCGATGGG GCGTAAAATT TCTGTCGATT CGGCTACCAT GATGAACAAA
GGTCTGGAAT ACATTGAAGC GCGTTGGCTG TTTAACGCCA GCGCCAGCCA GATGGAAGTG
CTGATTCACC CGCAGTCAGT GATTCACTCA ATGGTGCGCT ATCAGGACGG CAGTGTTCTG
GCGCAGCTGG GGGAACCGGA TATGCGTACG CCAATTGCCC ACACCATGGC ATGGCCGAAT
CGCGTGAACT CTGGCGTGAA GCCGCTCGAT TTTTGCAAAC TAAGTGCGTT GACATTTGCC
GCACCGGATT ATGATCGTTA TCCATGCCTG AAACTGGCGA TGGAGGCGTT CGAACAAGGC
CAGGCAGCGA CGACAGCATT GAATGCCGCA AACGAAATCA CCGTTGCTGC TTTTCTTGCG
CAACAAATCC GCTTTACGGA TATCGCTGCG TTGAATTTAT CCGTACTGGA AAAAATGGAT
ATGCGCGAAC CACAATGTGT GGACGATGTG TTATCTGTTG ATGCGAACGC GCGTGAAGTC
GCCAGAAAAG AGGTGATGCG TCTCGCAAGC TGAGTCGACG AGGAGAAATT AACCATGGCA
ACCACTCATT TGGATGTTTG CGCCGTGGTT CCGGCGGCCG GATTTGGCCG TCGAATGCAA
ACGGAATGTC CTAAGCAATA TCTCTCAATC GGTAATCAAA CCATTCTTGA ACACTCGGTG
CATGCGCTGC TGGCGCATCC CCGGGTGAAA CGTGTCGTCA TTGCCATAAG TCCTGGCGAT
AGCCGTTTTG CACAACTTCC TCTGGCGAAT CATCCGCAAA TCACCGTTGT AGATGGCGGT
GATGAGCGTG CCGATTCCGT GCTGGCAGGT CTGAAAGCCG CTGGCGACGC GCAGTGGGTA
TTGGTGCATG ACGCCGCTCG TCCTTGTTTG CATCAGGATG ACCTCGCGCG ATTGTTGGCG
TTGAGCGAAA CCAGCCGCAC GGGGGGGATC CTCGCGGCAC CAGTGCGCGA TACTATGAAA
CGTGCCGAAC CGGGCAAAAA TGCCATTGCT CATACCGTTG ATCGCAACGG CTTATGGCAC
GCGCTGACGC CGCAATTTTT CCCTCGTGAG CTGTTACATG ACTGTCTGAC GCGCGCTCTA
AATGAAGGCG CGACTATTAC CGACGAAGCC TCGGCGCTGG AATATTGCGG ATTCCATCCT
CAGTTGGTCG AAGGCCGTGC GGATAACATT AAAGTCACGC GCCCGGAAGA TTTGGCACTG
GCCGAGTTTT ACCTCACCCG AACCATCCAT CAGGAGAATA CATAATGCGA ATTGGACACG
GTTTTGACGT ACATGCCTTT GGCGGTGAAG GCCCAATTAT CATTGGTGGC GTACGCATTC
CTTACGAAAA AGGATTGCTG GCGCATTCTG ATGGCGACGT GGCGCTCCAT GCGTTGACCG
ATGCATTGCT TGGCGCGGCG GCGCTGGGGG ATATCGGCAA GCTGTTCCCG GATACCGATC
CGGCATTTAA AGGTGCCGAT AGCCGCGAGC TGCTACGCGA AGCCTGGCGT CGTATTCAGG
CGAAGGGTTA TACCCTTGGC AACGTCGATG TCACTATCAT CGCTCAGGCA CCGAAGATGT
TGCCGCACAT TCCACAAATG CGCGTGTTTA TTGCCGAAGA TCTCGGCTGC CATATGGATG
ATGTTAACGT GAAAGCCACT ACTACGGAAA AACTGGGATT TACCGGACGT GGGGAAGGGA
TTGCCTGTGA AGCGGTGGCG CTACTCATTA AGGCAACAAA ATGACTCGAG GAGGAGAAAT
TAACCATGCG GACACAGTGG CCCTCTCCGG CAAAACTTAA TCTGTTTTTA TACATTACCG
GTCAGCGTGC GGATGGTTAC CACACGCTGC AAACGCTGTT TCAGTTTCTT GATTACGGCG
ACACCATCAG CATTGAGCTT CGTGACGATG GGGATATTCG TCTGTTAACG CCCGTTGAAG
GCGTGGAACA TGAAGATAAC CTGATCGTTC GCGCAGCGCG ATTGTTGATG AAAACTGCGG
CAGACAGCGG GCGTCTTCCG ACGGGAAGCG GTGCGAATAT CAGCATTGAC AAGCGTTTGC
CGATGGGCGG CGGTCTCGGC GGTGGTTCAT CCAATGCCGC GACGGTCCTG GTGGCATTAA
ATCATCTCTG GCAATGCGGG CTAAGCATGG ATGAGCTGGC GGAAATGGGG CTGACGCTGG
GCGCAGATGT TCCTGTCTTT GTTCGGGGGC ATGCCGCGTT TGCCGAAGGC GTTGGTGAAA
TACTAACGCC GGTGGATCCG CCAGAGAAGT GGTATCTGGT GGCGCACCCT GGTGTAAGTA
TTCCGACTCC GGTGATTTTT AAAGATCCTG AACTCCCGCG CAATACGCCA AAAAGGTCAA
TAGAAACGTT GCTAAAATGT GAATTCAGCA ATGATTGCGA GGTTATCGCA AGAAAACGTT
TTCGCGAGGT TGATGCGGTG CTTTCCTGGC TGTTAGAATA CGCCCCGTCG CGCCTGACTG
GGACAGGGGC CTGTGTCTTT GCTGAATTTG ATACAGAGTC TGAAGCCCGC CAGGTGCTAG
AGCAAGCCCC GGAATGGCTC AATGGCTTTG TGGCGAAAGG CGCTAATCTT TCCCCATTGC
ACAGAGCCAT GCTTTAAGGT ACCCAATTCG CCCTATAGTG AGTCGTATTA CGCGCGCTCA
CTGGCCGTCG TTTTACAACG TCGTGACTGG GAAAACCCTG GCGTTACCCA ACTTAATCGC
CTTGCAGCAC ATCCCCCTTT CGCCAGCTGG CGTAATAGCG AAGAGGCCCG CACCGATCGC
CCTTCCCAAC AGTTGCGCAG CCTGAATGGC GAATGGAAAT TGTAAGCGTT AATATTTTGT
TAAAATTCGC GTTAAATTTT TGTTAAATCA GCTCATTTTT TAACCAATAG GCCGAAATCG
GCAAAATCCC TTATAAATCA AAAGAATAGA CCGAGATAGG GTTGAGTGTT GTTCCAGTTT
GGAACAAGAG TCCACTATTA AAGAACGTGG ACTCCAACGT CAAAGGGCGA AAAACCGTCT
ATCAGGGCGA TGGCCCACTA CGTGAACCAT CACCCTAATC AAGTTTTTTG GGGTCGAGGT
GCCGTAAAGC ACTAAATCGG AACCCTAAAG GGAGCCCCCG ATTTAGAGCT TGACGGGGAA
AGCCGGCGAA CGTGGCGAGA AAGGAAGGGA AGAAAGCGAA AGGAGCGGGC GCTAGGGCGC
TGGCAAGTGT AGCGGTCACG CTGCGCGTAA CCACCACACC CGCCGCGCTT AATGCGCCGC
TACAGGGCGC GTCAG
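
The ID and SQ lines that head the vector sequences in these annexes state the total length and the number of A, C, G and T residues, which gives a simple checksum for a transcribed sequence. The following is a minimal Python sketch of such a check; the function name base_composition and the returned dictionary layout are assumptions made for illustration and are not part of the original disclosure.

```python
from collections import Counter


def base_composition(sequence):
    """Count residues in a DNA sequence, ignoring spaces, digits and line breaks.

    Returns a dict with the same fields as an SQ line:
    total length (BP), A, C, G, T and OTHER.
    """
    cleaned = "".join(ch for ch in sequence.upper() if ch.isalpha())
    counts = Counter(cleaned)
    acgt_total = sum(counts[base] for base in "ACGT")
    return {
        "BP": len(cleaned),
        "A": counts["A"],
        "C": counts["C"],
        "G": counts["G"],
        "T": counts["T"],
        "OTHER": len(cleaned) - acgt_total,
    }


# Example on the final, incomplete row of the pBScyclogcpElytB2 sequence.
print(base_composition("TACAGGGCGC GTCAG"))
```

Applied to a complete transcribed sequence, the result can be compared field by field with the corresponding SQ line; any residue that was misread as a non-ACGT character shows up in the OTHER count.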
Annex L

DNA and corresponding amino acid sequence of the ispG gene (fragment) from 

Arabidopsis thaliana 



10 20 30 40 50 60 70 

I I I I I I I

>AAGACGGTGAGAAGGAAGACTCGTACTGTTATGGTTGGAAATGTCGCCCTTGGAAGCGAACATCCGATAAGG 
KTVRRKTRTVMVGNVALGSEHPIR

80 90 100 110 120 130 140 

I I I I I I I

ATTCAAACGATGACTACTTCGGATACAAAAGATATTACTGGAACTGTTGATGAGGTTATGAGAATAGCGGAT 
IQTMTTSDTKDITGTVDEVMRIAD 

150 160 170 180 190 200 210 

I I I I I I I 

AAAGGAGCTGATATTGTAAGGATAACTGTTCAAGGGAAGAAAGAGGCGGATGCGTGCTTTGAAATAAAAGAT 
KGADIVRITVQGKKEADACFEIKD

220 230 240 250 260 270 280 

I I I I I I I 

AAACTCGTTCAGCTTAATTACAATACACCGCTGGTTGCAGGTATTCATTTTGCCCCTACTGTAGCCTTACGA 
KLVQLNYNTPLVAGIHFAPTVALR 

290 300 310 320 330 340 350 360 

I I I I I I I I 

GTCGCTGAATGCTTTGACAAGATCCGTGTCAACCCCGGAAATTTTGCGGACAGGCGGGCCCAGTTTGAGACG 
VAECFDKIRVNPGNFADRRAQFET 

370 380 390 400 410 420 430 

I I I I I I I 

ATAGATTATACAGAAGATGAATATCAGAAAGAACTCCAGCATATCGAGCAGGTCTTCACTCCTTTGGTTGAG
IDYTEDEYQKELQHIEQVFTPLVE

440 450 460 470 480 490 500 

I I I I I I I 

AAATGCAAAAAGTACGGGAGAGCAATGCGTATTGGGACAAATCATGGAAGTCTTTCTGACCGTATCATGAGC 
KCKKYGRAMRIGTNHGSLSDRIMS

510 520 530 540 550 560 570 

I I I I I I I 

TATTACGGGGATTCTCCCCGAGGAATGGTTGAATCTGCGTTTGAGTTTGCAAGAATATGTCGGAAATTAGAC 
YYGDSPRGMVESAFEFARICRKLD

580 590 600 610 620 630 640 

I I I I I I I 

TAT<aiCAACrTTGTTTTCTCAATGAAAGCGAGaACCCAGTGATCATGGTCaiGGCGTACC^ 

YHNPVFSMKASNPVIMVQAYRLLV 

650 660 670 680 690 700 710 720 

I I I I I I I I 

GCTGAGATGTATGTTCATGGATGGGATTATCCTTTGCATTTGGGAGTTACTGAGGCAGGAGAAGGCGAAGAT 
AEMYVHGWDYPLHLGVTEAGEGED

730 740 750 760 770 780 790 

I I I I I I I 

GGACGGATGAAATCTGCGATTGGAATTGGGACGCTTCTTCAGGACGGGCTCGGTGACACAACAAGAGTTTCA 
GRMKSAIGIGTLLQDGLGDTTRVS 

800 810 820 830 840 850 860 

I I I I I I I 

CTGACGGAGCCACCAGAAGAGGAGATAGATCCCTGCAGGCGATTGGCTAACCTCGGGACAAAAGCTGCCAAA
LTEPPEEEIDPCRRLANLGTKAAK

870 880 890 900 910 920 930 

I I I I I I I 

CTTCAACAAGGCGCTGCACCXSTTTGAAGAAAAGaVTAGGCATTACTTTGATTTTCAGaSTCGGACGGGT^ 
LQQGAAPFEEKHRHYFDFQRRTGD 

940 950 960 970 980 990 1000 

I I I I I I I 

CTACCTGTACAAAAAGAGGGAGAAGAGGTTGATTACAGAAATGTCCTTCACCGTGATGGTTCTGTTCTGATG 
LPVQKEGEEVDYRNVLHRDGSVLM 

1010 1020 1030 1040 1050 1060 1070 1080 

I I I I I I I I 

TCGATTTCTCTGGATCAACTAAAGGCy^CCrGAACTCCTCTACAGATCACTCGCCACAAAGCTTGTCGTGGG^ 
SISLDQLKAPELLYRSLATKLVVG

1090 1100 1110 1120 1130 1140 1150 

I I I I I I I 

ATGCCATTCAAGGATCTGGCAACTGTTGATTCAATCTTATTAAGAGAGCTACCGCCTGTAGATGATCAA
MPFKDLATVDSILLRELPPVDDQV

1160 1170 1180 1190 1200 1210 1220 

I I I I I I I 

GCTCGTTTGGCTCTCAAACGGTTGATTGATGTO^GTATGGGAGTTATAGCACCTTTATCAGAGCAACTAACA 
ARLALKRLIDVSMGVIAPLSEQLT

1230 1240 1250 1260 1270 1280 1290 

I I I I I I I 

AAGCCATTGCCCAATGCCATGGTTCTTGTCAACCTCAAGGAACTATCTGGTGGCGCTTACAAGCTTCTCCCT
KPLPNAMVLVNLKELSGGAYKLLP 

1300 1310 1320 1330 1340 1350 1360 

I I I I I I I 

GAAGGTACACGCTTGGTTGTCTCTCTACGAGGCGATGAGCCTTACGAGGAGCTTGAAATACTCAAAAACATT 
EGTRLVVSLRGDEPYEELEILKNI 

1370 1380 1390 1400 1410 1420 1430 1440 

I I I I I I I I 

GATGCTACTATGATTCTCCATGATGTACCTTTCACTGAAGACAAAGTTAGCAGAGTACATGCAGCTCGGAGG 
DATMILHDVPFTEDKVSRVHAARR 

1450 1460 1470 1480 1490 1500 1510 

I I I I I I I 

CTATTCGAGTTCTTATCCGAGAATTCAGTTAACTTTCCTGTTATTCATCGCATAAACTTCCCAACCGGAATC 
LFEFLSENSVNFPVIHRINFPTGI 

1520 1530 1540 1550 1560 1570 1580 

I I I I I I I 

CACAGAGACGAATTGGTGATTCATGCAGGGACATATGCTGGAGGCCTTCTTGTGGATGGACTAGGTGATGGC 
HRDELVIHAGTYAGGLLVDGLGDG 

1590 1600 1610 1620 1630 1640 1650 

I I I I I I I 

GTAATGCTCGAAGCACCTGACCAAGATTTTGATTTTCTTAGGAATACTTCCTTCAACTTATTACAAGGATGC 
VMLEAPDQDFDFLRNTSFNLLQGC 

1660 1670 1680 1690 1700 1710 1720 

I I I I I I I 

AGAATGCGTAACACTAAGACGGAATATGTATCGTGCCCGTCTTGTGGAAGAACGCTTTTCGACTTGCAAGAA 
RMRNTKTEYVSCPSCGRTLFDLQE 

1730 1740 1750 1760 1770 1780 1790 1800 

I I I I I I I I 

ATCAGCGCCGAGATCCGAGAAAAGACTTCCCATTTACCTGGCGTTTCGATCGCAATCATGGGATGCATTGTG 
ISAEIREKTSHLPGVSIAIMGCIV 



1810 1820 1830 1840 1850 1860 1870 
I I I I I I I 

AATGGACCAGGAGAAATGGCAGATGCTGATTTCGGATATGTAGGTGGTTCTCCCGGAAAAATCGACCTTTAT 
NGPGEMADADFGYVGGSPGKIDLY 

1880 1890 1900 1910 1920 1930 1940 

I I I I I I I 

GTCGGAAAGACGGTGGTGAAGCGTGGGATAGCTATGACGGAGGCAACAGATGCTCTGATCGGTCTGATCAAA 
VGKTVVKRGIAMTEATDALIGLIK 

1950 1960 1970 1980 

I I I I 

GAACATGGTCGTTGGGTCGACCCGCCCGTGGCTGATGAGTAG 
EHGRWVDPPVADE - 
Annex M 

DNA and corresponding amino acid sequence of the ispG (formerly gcpE) gene of 

Arabidopsis thaliana 

ATGGCGACTGGAGTATTGCCAGCTCCGGTTTCTGGGATCAAGATACCGGATTCGAAAGTC 60 
MATGVLPAPVSGIKIPDSKV 20 

GGGTTTGGTAAAAGCATGAATCTTGTGAGAATTTGTGATGTTAGGAGTCTAAGATCTGCT 120 
GFGKSMNLVRICDVRSLRSA 40 

AGGAGAAGAGTTTCGGTTATCCGGAATTCAAACCAAGGCTCTGATTTAGCTGAGCTTCAA 180 
RRRVSVIRNSNQGSDLAELQ 60 

CCTGCATCCGAAGGAAGCCCTCTCTTAGTGCCAAGACAGAAATATTGTGAATCATTGCAT 240 
PASEGSPLLVPRQKYCESLR 80 

AAGACGGTGAGAAGGAAGACTCGTACTGTTATGGTTGGAAATGTCGCCCTTGGAAGCGAA 300 
KTVRRKTRTVMVGNVALGSE 100 

CATCCGATAAGGATTCAAACGATGACTACTTCGGATACAAAAGATATTACTGGAACTGTT 360 
HPIRIQTMTTSDTKDITGTV 120 

GATGAGGTTATGAGAATAGCGGATAAAGGAGCTGATATTGTAAGGATAACTGTTCAAGGG 420 
DEVMRIADKGADIVRITVQG 140 

AAGAAAGAGGCGGATGCGTGCTTTGAAATAAAAGATAAACTCGTTCAGCTTAATTACAAT 480 
KKEADACFEIKDKLVQLNYN 160 

ACACCGCTGGTTGCAGGTATTCATTTTGCCCCTACTGTAGCCTTACGAGTCGCTGAATGC 540 
TPLVAGIHFAPTVALRVAEC 180 

TTTGACAAGATCCGTGTCAACCCCGGAAATTTTGCGGACAGGCGGGCCCAGTTTGAGACG 600 
FDKIRVNPGNFADRRAQFET 200 

ATAGATTATACAGAAGATGAATATCAGAAAGAACTCCAGCATATCGAGCAGGTCTTCACT 660 
IDYTEDEYQKELQHIEQVFT 220 

CCTTTGGTTGAGAAATGCAAAAAGTACGGGAGAGCAATGCGTATTGGGACAAATCATGGA 720 
PLVEKCKKYGRAMRIGTNHG 240 

AGTCTTTCTGACCGTATCATGAGCTATTACGGGGATTCTCCCCGAGGAATGGTTGAATCT 780 
SLSDRIMSYYGDSPRGMVES 260 

GCGTTTGAGTTTGCAAGAATATGTCGGAAATTAGACTATCACAACTTTGTTTTCTCAATG 840 
AFEFARICRKLDYHNFVFSM 280 

AAAGCGAGCAACCCAGTGATCATGGTCCAGGCGTACCGTTTACTTGTGGCTGAGATGTAT 900 
KASNPVIMVQAYRLLVAEMY 300 

GTTCATGGATGGGATTATCCTTTGCATTTGGGAGTTACTGAGGCAGGAGAAGGCGAAGAT 960 
VHGWDYPLHLGVTEAGEGED 320 



GGACGGATGAAATCTGCGATTGGAATTGGGACGCTTCTTCAGGACGGGCTCGGTGACACA 1020 
GRMKSAIGIGTLLQDGLGDT 340 

ACAAGAGTTTCACTGACGGAGCCACCAGAAGAGGAGATAGATCCCTGCAGGCGATTGGCT 1080 
TRVSLTEPPEEEIDPCRRLA 360 

AACCTCGGGACAAAAGCTGCCAAACTTCAACAAGGCGCTGCACCGTTTGAAGAAAAGCAT 1140 
NLGTKAAKLQQGAAPFEEKH 380 



AGGCATTACTTTGATTTTCAGCGTCGGACGGGTGATCTACCTGTACAAAAAGAGGGAGAA 1200 
RHYFDFQRRTGDLPVQKEGE 400 

GAGGTTGATTACAGAAATGTCCTTCACCGTGATGGTTCTGTTCTGATGTCGATTTCTCTG 1260 
EVDYRNVLHRDGSVLMSISL 420 

GATCAACTAAAGGCACCTGAACTCCTCTACAGATCACTCGCCACAAAGCTTGTCGTGGGT 1320 
DQLKAPELLYRSLATKLVVG 440 

ATGCCATTCAAGGATCTGGCAACTGTTGATTCAATCTTATTAAGAGAGCTACCGCCTGTA 1380 
MPFKDLATVDSILLRELPPV 460 

GATGATCAAGTGGCTCGTTTGGCTCTCAAACGGTTGATTGATGTCAGTATGGGAGTTATA 1440 
DDQVARLALKRLIDVSMGVI 480 

GCACCTTTATCAGAGCAACTAACAAAGCCATTGCCCAATGCCATGGTTCTTGTCAACCTC 1500 
APLSEQLTKPLPNAMVLVNL 500 

AAGGAACTATCTGGTGGCGCTTACAAGCTTCTCCCTGAAGGTACACGCTTGGTTGTCTCT 1560 
KELSGGAYKLLPEGTRLVVS 520 

CTACGAGGCGATGAGCCTTACGAGGAGCTTGAAATACTCAAAAACATTGATGCTACTATG 1620 
LRGDEPYEELEILKNIDATM 540 

ATTCTCCATGATGTACCTTTCACTGAAGACAAAGTTAGCAGAGTACATGCAGCTCGGAGG 1680 
ILHDVPFTEDKVSRVHAARR 560 

CTATTCGAGTTCTTATCCGAGAATTCAGTTAACTTTCCTGTTATTCATCGCATAAACTTC 1740 
LFEFLSENSVNFPVIHRINF 580 

CCAACCGGAATCCACAGAGACGAATTGGTGATTCATGCAGGGACATATGCTGGAGGCCTT 1800 
PTGIHRDELVIHAGTYAGGL 600 

CTTGTGGATGGACTAGGTGATGGCGTAATGCTCGAAGCACCTGACCAAGATTTTGATTTT 1860 
LVDGLGDGVMLEAPDQDFDF 620 

CTTAGGAATACTTCCTTCAACTTATTACAAGGATGCAGAATGCGTAACACTAAGACGGAA 1920 
LRNTSFNLLQGCRMRNTKTE 640 

TATGTATCGTGCCCGTCTTGTGGAAGAACGCTTTTCGACTTGCAAGAAATCAGCGCCGAG 1980 
YVSCPSCGRTLFDLQEISAE 660 

ATCCGAGAAAAGACTTCCCATTTACCTGGCGTTTCGATCGCAATCATGGGATGCATTGTG 2040 
IREKTSHLPGVSIAIMGCIV 680 

AATGGACCAGGAGAAATGGCAGATGCTGATTTCGGATATGTAGGTGGTTCTCCCGGAAAA 2100 
NGPGEMADADFGYVGGSPGK 700 

ATCGACCTTTATGTCGGAAAGACGGTGGTGAAGCGTGGGATAGCTATGACGGAGGCAACA 2160 
IDLYVGKTVVKRGIAMTEAT 720 

GATGCTCTGATCGGTCTGATCAAAGAACATGGTCGTTGGGTCGACCCGCCCGTGGCTGAT 2220 
DALIGLIKEHGRWVDPPVAD 740 



GAGTAG 2226 
E - 741 
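
The pairing of each DNA row with its amino acid row in this annex follows the standard genetic code, so any row can be checked by translating it codon by codon. The following is a minimal sketch of such a check, not part of the disclosed methods; the function names `translate` and `mismatches` are hypothetical, and the sketch assumes the standard code, reading frame 1, and the annex convention of printing a stop codon as "-".

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, stop_symbol="-"):
    """Translate a DNA row in frame 1; whitespace is ignored.

    Raises KeyError if a codon contains a character other than A, C, G, T,
    which is a convenient way to notice a damaged row.
    """
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # drop an incomplete trailing codon
    residues = []
    for i in range(0, usable, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def mismatches(dna, printed):
    """Return (position, translated, printed) for residues that disagree.

    Positions are 1-based within the row; spaces in the printed row are ignored.
    """
    printed = "".join(printed.split())
    expected = translate(dna)
    return [
        (i + 1, e, p)
        for i, (e, p) in enumerate(zip(expected, printed))
        if e != p
    ]


# First row of this annex (nucleotides 1 to 60, residues 1 to 20).
row_dna = "ATGGCGACTGGAGTATTGCCAGCTCCGGTTTCTGGGATCAAGATACCGGATTCGAAAGTC"
row_protein = "MATGVLPAPVSGIKIPDSKV"
print(translate(row_dna))
print(mismatches(row_dna, row_protein))
```

An empty list from `mismatches` means the printed residues agree with the translation of the DNA row; a non-empty list points to the residue positions worth re-reading against the original filing.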
Annex N 

cDNA sequence of 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate reductase (IspH) 

from Arabidopsis thaliana 

ATGGCTGTTGCGCTCCAATTCAGCCGATTATGCGTTCGACCGGATACTTTCGTGCGGGAGAATCATCTCTCT 72 

MAVALQFSRLCVRPDTFVRENHLS 24 

GGATCCGGATCTCTCCGCCGCCGGAAAGCTTTATCAGTCCGGTGCTCGTCTGGCGATGAGAACGCTCCTTCG 144 

GSGSLRRRKALSVRCSSGDENAPS 48 

CCATCGGTGGTGATGGACTCCGATTTCGACGCCAAGGTGTTCCGTAAGAACTTGACGAGAAGCGATAATTAC 216 

PSVVMDSDFDAKVFRKNLTRSDNY 72 

AATCGTAAAGGGTTCGGTCATAAGGAGGAGACACTCAAGCTCATGAATCGAGAGTACACCAGTGATATATTG 288 

NRKGFGHKEETLKLMNREYTSDIL 96 

GAGACACTGAAAACAAATGGGTATACTTATTCTTGGGGAGATGTTACTGTGAAACTCGCTAAAGCATATGGT 360 

ETLKTNGYTYSWGDVTVKLAKAYG 120 

TTTTGCTGGGGTGTTGAGCGTGCTGTTCAGATTGCATATGAAGCACGAAAGCAGTTTCCAGAGGAGAGGCTT 432 

FCWGVERAVQIAYEARKQFPEERL 144 

TGGATTACTAACGAAATCATTCATAACCCGACCGTCAATAAGAGGTTGGAAGATATGGATGTTAAAATTATT 504 

WITNEIIHNPTVNKRLEDMDVKII 168 

CCGGTTGAGGATTCAAAGAAACAGTTTGATGTAGTAGAGAAAGATGATGTGGTTATCCTTCCTGCGTTTGGA 576 

PVEDSKKQFDVVEKDDVVILPAFG 192 

GCTGGTGTTGACGAGATGTATGTTCTTAATGATAAAAAGGTGCAAATTGTTGACACGACTTGTCCTTGGGTG 648 

AGVDEMYVLNDKKVQIVDTTCPWV 216 

ACAAAGGTCTGGAACACGGTTGAGAAGCACAAGAAGGGGGAATACACATCAGTAATCCATGGTAAATATAAT 720 

TKVWNTVEKHKKGEYTSVIHGKYN 240 

CATGAAGAGACGATTGCAACTGCGTCTTTTGCAGGAAAGTACATCATTGTAAAGAACATGAAAGAGGCAAAT 792 

HEETIATASFAGKYIIVKNMKEAN 264 

TACGTTTGTGATTACATTCTCGGTGGCCAATACGATGGATCTAGCTCCACAAAAGAGGAGTTCATGGAGAAA 864 

YVCDYILGGQYDGSSSTKEEFMEK 288 

TTCAAATACGCAATTTCGAAGGGTTTCGATCCCGACAATGACCTTGTCAAAGTTGGTATTGCAAACCAAACA 936 

FKYAISKGFDPDNDLVKVGIANQT 312 

ACGATGCTAAAGGGAGAAACAGAGGAGATAGGAAGATTACTCGAGACAACAATGATGCGCAAGTATGGAGTG 1008 

TMLKGETEEIGRLLETTMMRKYGV 336 

GAAAATGTAAGCGGACATTTCATCAGCTTCAACACAATATGCGACGCTACTCAAGAGCGACAAGACGCAATC 1080 

ENVSGHFISFNTICDATQERQDAI 360 

TATGAGCTAGTGGAAGAGAAGATTGACCTCATGCTAGTGGTTGGCGGATGGAATTCAAGTAACACCTCTCAC 1152 

YELVEEKIDLMLVVGGWNSSNTSH 384 

CTTCAGGAAATCTCAGAGGCACGGGGAATCCCATCTTACTGGATCGATAGTGAGAAACGGATAGGACCTGGG 1224 

LQEISEARGIPSYWIDSEKRIGPG 408 

AATAAAATAGCCTATAAGCTCCACTATGGAGAACTGGTCGAGAAGGAAAACTTTCTCCCAAAGGGACCAATA 1296 

NKIAYKLHYGELVEKENFLPKGPI 432 

ACAATCGGTGTGACATCAGGTGCATCAACCCCGGATAAGGTCGTGGAAGATGCTTTGGTGAAGGTGTTCGAC 1368 

TIGVTSGASTPDKVVEDALVKVFD 456 